Question

*** caught segfault *** , cause 'memory not mapped'


Seymoo • 0

@seymoo-12522

Last seen 12 months ago

Oslo

I am running the affy::justRMA function on a computer cluster to build an ExpressionSet from HTA 2.0 CEL files.

The function runs fine when I run # part 1 of the code (see the attachment), but I get the following error when I increase the number of CEL files to 700 in # part 2 (please see the output in the attachment):

    *** caught segfault ***
    address 0x2b3583efa830, cause 'memory not mapped'

Since I run this function on a computer cluster, I have 173 GB of RAM, and as you can see from the attachment, R recognizes the memory (memuse::Sys.meminfo()). Since the code runs without problems on fewer samples, this is either an issue with the function or an R memory issue.

Any similar experience? Solution?
Is it possible to use the frma package for HTA CEL files with a custom CDF from the Brainarray library (hta20hsentrezgcdf)?

Best

Hossein


affycoretools FRMA affyio AffymetrixDataTestFiles affy • 1.5k views

ADD COMMENT • link updated 12 months ago by James W. MacDonald 65k • written 12 months ago by Seymoo • 0

score 0 · Answer 1 · 2023-03-30


James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 1 hour ago

United States

You are unfortunately running out of RAM. As you note, you could possibly use frma, after generating frozen vectors using the frmaTools package.

ADD COMMENT • link 12 months ago James W. MacDonald 65k

score 0 · Answer 2 · 2023-03-31


Seymoo • 0

@seymoo-12522

Last seen 12 months ago

Oslo

Thanks for the reply, James. What I find odd is that on my local PC with 48 GB of RAM I am able to run 611 HTA 2.0 CEL files. If I add 10 more, the R session aborts due to a fatal error.

However, when I add only 20 more samples and run this on a cluster computer with 150 GB of RAM, memory is still an issue, which makes me think that there might be an issue with the justRMA function.

Are you aware if others have been able to successfully use justRMA with, say, >650 CEL files?

Also, do you think it is wise to use a for loop to read the CEL files with justRMA(normalize = FALSE) and do the normalization after all CEL files have been background corrected?

Best

ADD COMMENT • link 12 months ago Seymoo • 0


If you are adding a comment, please use the ADD COMMENT button, rather than ADD ANSWER.

I think you are correct - the normal error for running out of RAM is something like 'Error: a vector of length XXX could not be allocated'.

This isn't an issue with justRMA, but instead it's a C-level issue (both rma and justRMA use the same underlying C code, but justRMA skips expensive steps like instantiating an AffyBatch first). The C code for the affy package was written years ago by Ben Bolstad, who hasn't been around these parts for years now, and people don't really use Affy arrays these days, so it's hard to get much impetus for people to want to fix an ancient codebase that works for like 99.999% of the few remaining uses.

Also, the affy package was written back in the day when a given probe was only ever used for a single probeset, and for the later arrays like the HTA series Affy started sharing probes across multiple probesets. This was a problem for the affy package, and IIRC Ben Bolstad did something to patch it, but it wasn't really a priority because the oligo package didn't care about such things and was meant to replace affy anyway. It may be an edge case in the C code that comes up with too many arrays, but I don't know C, so am unable to be of any help for that.

Also also, the affy package wasn't ever intended to be used for HTA arrays, and even though MBNI has remapped the probes and made a CDF package, that's really an off-label use for affy, so I am not sure it's in anybody's interest to 'fix' a package for a use it was never intended for.

All that said, you might consider just processing the data in two batches. With the number of arrays you are using, I wouldn't think there would be a batch effect, and even if there were you could adjust for that in your linear model.
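As a rough illustration of the two-batch suggestion, here is a minimal sketch in R. The helper names (`split_into_batches`, `rma_in_batches`, `make_design`) are hypothetical and not part of affy; the `cdfname` value is whatever you already pass to justRMA, and the `group` factor stands in for your own phenotype of interest. It assumes both batches are summarized with the same CDF, so the probeset rows line up.

```r
# Hypothetical helper: split a vector of CEL file paths into
# n_batches contiguous, roughly equal-sized batches.
split_into_batches <- function(files, n_batches = 2) {
  batch_id <- sort(rep(seq_len(n_batches), length.out = length(files)))
  split(files, batch_id)
}

# Hypothetical helper: run affy::justRMA separately on each batch and
# bind the expression matrices column-wise. Requires the affy and
# Biobase packages only when it is actually called.
rma_in_batches <- function(cel_files, n_batches = 2, cdfname = NULL) {
  batches <- split_into_batches(cel_files, n_batches)
  mats <- lapply(batches, function(f) {
    Biobase::exprs(affy::justRMA(filenames = f, cdfname = cdfname))
  })
  # Same CDF in every batch, so the probeset IDs should be identical.
  stopifnot(all(vapply(mats, function(m) {
    identical(rownames(m), rownames(mats[[1]]))
  }, logical(1))))
  list(
    exprs = do.call(cbind, mats),
    batch = factor(rep(seq_along(batches), lengths(batches)))
  )
}

# Hypothetical helper: design matrix with the processing batch as a
# covariate, so a linear model can adjust for any batch effect.
make_design <- function(group, batch) {
  model.matrix(~ group + batch)
}
```

With something like `res <- rma_in_batches(cel_files, 2, cdfname = my_cdf)`, the `res$batch` factor can be passed to `make_design()` together with your own grouping factor. Because the split here follows file order, it may be worth shuffling the file list first so that batch is not confounded with the comparison of interest.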

ADD REPLY • link 12 months ago James W. MacDonald 65k