Hello,
I ran into a problem with the background correction of Clariom D arrays in oligo's rma():
> dat <- read.celfiles(list.celfiles())
> dat
HTAFeatureSet (storageMode: lockedEnvironment)
assayData: 6892960 features, 18 samples
element names: exprs
protocolData
rowNames: P_ADMSC_C1.CEL P_ADMSC_C2.CEL ... P_FB.CEL (18 total)
varLabels: exprs dates
varMetadata: labelDescription channel
phenoData
rowNames: P_ADMSC_C1.CEL P_ADMSC_C2.CEL ... P_FB.CEL (18 total)
varLabels: index
varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.clariom.d.human
Without background correction, everything is fine:
>
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 138745 features, 18 samples
element names: exprs
protocolData
rowNames: P_ADMSC_C1.CEL P_ADMSC_C2.CEL ... P_FB.CEL (18 total)
varLabels: exprs dates
varMetadata: labelDescription channel
phenoData
rowNames: P_ADMSC_C1.CEL P_ADMSC_C2.CEL ... P_FB.CEL (18 total)
varLabels: index
varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.clariom.d.human
But with background correction enabled, R crashes:
> eset.bg.normalized <- rma(dat,target="core",background=T,normalize=T)
Background correcting
*** caught segfault ***
address 0x562729e52000, cause 'memory not mapped'
Traceback:
1: basicRMA(pms, pnVec, normalize, background)
2: .local(object, ...)
3: rma(dat, target = "core", background = T, normalize = T)
4: rma(dat, target = "core", background = T, normalize = T)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 1
R is aborting now ...
Segmentation fault (core dumped)
Bisecting through the input data, I have now found the culprit .CEL file. With that one excluded, the remaining 17 files go through background correction and normalization without problems, and I can trigger the crash with that single .CEL file alone.
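For anyone repeating this kind of bisection, here is a minimal sketch of how each file could be tried in its own R process, so that a segfault shows up as a non-zero exit status instead of killing the interactive session. The helper names `runs_cleanly` and `rma_call_for` are hypothetical, and the sketch assumes a Unix-like shell and that `Rscript` sits next to the running R installation.

```r
# Run an R expression in a fresh Rscript process and report whether it
# exited with status 0. A crash in the child (e.g. a segfault) yields a
# non-zero status here rather than aborting the calling session.
runs_cleanly <- function(expr_text) {
  rscript <- file.path(R.home("bin"), "Rscript")
  status <- suppressWarnings(
    system2(rscript, c("-e", shQuote(expr_text)), stdout = FALSE, stderr = FALSE)
  )
  identical(as.integer(status), 0L)
}

# Hypothetical helper: build the text of the RMA call from this post
# for a single CEL file.
rma_call_for <- function(cel_file) {
  sprintf(
    paste(
      "suppressMessages(library(oligo));",
      "d <- read.celfiles(%s);",
      "rma(d, target = 'core', background = TRUE, normalize = TRUE)"
    ),
    deparse(cel_file)
  )
}

# Usage (not run here; needs oligo, the annotation package and the CEL files):
# cel_files <- list.celfiles()
# ok <- vapply(cel_files, function(f) runs_cleanly(rma_call_for(f)), logical(1))
# cel_files[!ok]   # files whose isolated run did not exit cleanly
```

Running one file per process is slower than a manual bisection, but it needs no restarts and leaves a record of which files fail on their own.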
What should I do to chase this up (or to help others chase it up)?
There are no infinite values (tested as suggested in the thread "rma of oligo feature set crashes R").
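A check of this kind can be sketched as below. The function name `intensity_report` is hypothetical, and the input is assumed to be an ordinary numeric matrix with one column per array, such as the one `exprs(dat)` would normally return. Counting zero or negative intensities is only a further guess at what might matter, not a known cause of the crash.

```r
# Per-array (per-column) summary of values that could plausibly upset
# downstream processing: NA/NaN, +/-Inf, and zero or negative intensities.
intensity_report <- function(mat) {
  stopifnot(is.matrix(mat), is.numeric(mat))
  data.frame(
    n_na       = colSums(is.na(mat)),             # is.na() also counts NaN
    n_infinite = colSums(is.infinite(mat)),
    n_nonpos   = colSums(mat <= 0, na.rm = TRUE), # zero or negative values
    min        = apply(mat, 2, min, na.rm = TRUE),
    max        = apply(mat, 2, max, na.rm = TRUE),
    row.names  = colnames(mat)
  )
}

# Usage (not run here; assumes `dat` was read with read.celfiles()):
# intensity_report(exprs(dat))
```

Comparing the row for the failing array with the rows for the other 17 may show whether its raw intensities look unusual at all.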
Many thanks! Steffen
My session:
R version 4.2.2 (2022-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /home/sm718/miniconda3/lib/libmkl_rt.so
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] pd.clariom.d.human_3.14.1 DBI_1.1.3
[3] RSQLite_2.2.20 oligo_1.62.2
[5] Biostrings_2.66.0 GenomeInfoDb_1.34.8
[7] XVector_0.38.0 IRanges_2.32.0
[9] S4Vectors_0.36.0 Biobase_2.58.0
[11] oligoClasses_1.60.0 BiocGenerics_0.44.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.10 compiler_4.2.2
[3] BiocManager_1.30.19 MatrixGenerics_1.10.0
[5] bitops_1.0-7 iterators_1.0.14
[7] tools_4.2.2 zlibbioc_1.44.0
[9] bit_4.0.5 memoise_2.0.1
[11] preprocessCore_1.60.2 lattice_0.20-45
[13] ff_4.0.9 pkgconfig_2.0.3
[15] rlang_1.0.6 Matrix_1.5-3
[17] foreach_1.5.2 cli_3.6.0
[19] DelayedArray_0.24.0 fastmap_1.1.0
[21] GenomeInfoDbData_1.2.9 affxparser_1.70.0
[23] vctrs_0.5.2 bit64_4.0.5
[25] grid_4.2.2 blob_1.2.3
[27] codetools_0.2-19 matrixStats_0.63.0
[29] GenomicRanges_1.50.0 splines_4.2.2
[31] SummarizedExperiment_1.28.0 RCurl_1.98-1.10
[33] cachem_1.0.6 crayon_1.5.2
[35] affyio_1.68.0
I forgot this...
I tried the same code on another local project of ours (with 52 files) and it worked flawlessly, just as it did for the remaining 17 files of this project. I obviously cannot exclude the possibility that there is a problem with that one file; I just do not have another one. I would happily send it to you if you allow (24M gzipped). The normalization without background correction worked, so I have some hope left that it is not a complete disaster.
Thank you tons.
Steffen
I don't think it's a bug (the code for processing the arrays was written back in the early 2000s by Ben Bolstad and has been used approximately a gazillion times since then), so there are really only two choices:
1. Process without using background correction (something that Ben actually argued for, way back in the day).
2. Remove that sample.
Alternatively, I suppose you could use bgversion = 1 and see if that helps. Trying to diagnose the problem for one array is probably not useful - it's obviously that one array, and it's not clear you could fix it anyway - so I would make a choice and go forward.
Ok, let's leave it here. Thank you.