Hi! I'm trying to run preprocessENmix on a set of 1346 samples. I have previously run QCinfo on the RGChannelSetExtended and it works as expected.
# Read in RDS file
rgset <- readRDS("RGChannelSet_CVH.n1346.blood.RDS")
# # Run QC
# qc <- QCinfo(rgSet = rgset, distplot = FALSE)
#
# # Save QC info
# saveRDS(qc, "qc_info.blood.RDS")
However, when I try to use the QCinfo object to perform preprocessENmix, the job fails after a few hours with the error shown below. Any advice on how to resolve this error?
qc_info <- readRDS("qc_info.blood.RDS")
# Preprocess (background and dye-bias correction); quantile normalization is a separate step
pre <- preprocessENmix(rgset, QCinfo = qc_info, nCores = 1)
saveRDS(pre,"MethylSet_ENmix_prepreprocessed.CVH.blood.RDS")
# Error in h(simpleError(msg, call)) :
# error in evaluating the argument 'x' in selecting a method for function 'colnames': incorrect number of dimensions
# Calls: preprocessENmix -> enmix -> colnames -> .handleSimpleError -> h
# Execution halted
I've checked the classes of the RGSet and QCinfo objects, and they seem correct. I also looked at the colnames of "meth" in the preprocessENmix source code, and it has length 1346, the same as I would expect from the RGSet and the QCinfo object.
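For reference, here is a minimal sketch of that kind of consistency check. The helper `check_qc_consistency` is hypothetical (it is not part of ENmix or minfi), and the toy objects only stand in for `colnames(rgset)` and `qc_info`. It assumes the QCinfo list has the elements shown in the `str()` output (`detP`, `nbead`, `bisul`, `badsample`, `outlier_sample`).

```r
# Hypothetical helper (not part of ENmix or minfi): compare the sample IDs of
# an intensity object with the pieces of a QCinfo-style list.
check_qc_consistency <- function(sample_ids, qc) {
  list(
    # detP and nbead should be 2-D, with one column per sample
    detP_is_matrix    = length(dim(qc$detP)) == 2L,
    nbead_is_matrix   = length(dim(qc$nbead)) == 2L,
    detP_cols_match   = identical(colnames(qc$detP), sample_ids),
    nbead_cols_match  = identical(colnames(qc$nbead), sample_ids),
    bisul_names_match = identical(names(qc$bisul), sample_ids),
    # every flagged sample should exist in the data
    flagged_known     = all(c(qc$badsample, qc$outlier_sample) %in% sample_ids),
    n_flagged         = length(union(qc$badsample, qc$outlier_sample))
  )
}

# Toy data standing in for colnames(rgset) and qc_info
ids <- c("S1", "S2", "S3")
toy_qc <- list(
  detP  = matrix(0, nrow = 2, ncol = 3, dimnames = list(c("cg1", "cg2"), ids)),
  nbead = matrix(5, nrow = 2, ncol = 3, dimnames = list(c("cg1", "cg2"), ids)),
  bisul = setNames(c(1, 2, 3), ids),
  badsample = "S2",
  badCpG = character(0),
  outlier_sample = c("S2", "S3")
)
res <- check_qc_consistency(ids, toy_qc)
str(res)
```

With the real objects the call would be `check_qc_consistency(colnames(rgset), qc_info)`. A passing check does not rule out a problem inside `enmix` itself; it only confirms that the two inputs agree with each other.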
Here's the structure of each object:
str(rgset)
# Formal class 'RGChannelSetExtended' [package "minfi"] with 6 slots
# ..@ annotation : Named chr [1:2] "IlluminaHumanMethylationEPIC" "ilm10b4.hg19"
# .. ..- attr(*, "names")= chr [1:2] "array" "annotation"
# ..@ colData :Formal class 'DFrame' [package "S4Vectors"] with 6 slots
# .. .. ..@ rownames : chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# .. .. ..@ nrows : int 1346
# .. .. ..@ elementType : chr "ANY"
# .. .. ..@ elementMetadata: NULL
# .. .. ..@ metadata : list()
# .. .. ..@ listData : Named list()
# ..@ assays :Formal class 'SimpleAssays' [package "SummarizedExperiment"] with 1 slot
# .. .. ..@ data:Formal class 'SimpleList' [package "S4Vectors"] with 4 slots
# .. .. .. .. ..@ listData :List of 5
# .. .. .. .. .. ..$ Green : int [1:1051943, 1:1346] 10493 6344 1921 5607 640 7454 10790 8556 1181 2240 ...
# .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
# .. .. .. .. .. .. .. ..$ : chr [1:1051943] "1600101" "1600111" "1600115" "1600123" ...
# .. .. .. .. .. .. .. ..$ : chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# .. .. .. .. .. ..$ Red : int [1:1051943, 1:1346] 5330 4671 27302 5040 15729 1285 1874 2684 28226 20637 ...
# .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
# .. .. .. .. .. .. .. ..$ : chr [1:1051943] "1600101" "1600111" "1600115" "1600123" ...
# .. .. .. .. .. .. .. ..$ : chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# .. .. .. .. .. ..$ GreenSD: int [1:1051943, 1:1346] 662 374 373 632 210 901 1161 1184 240 729 ...
# .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
# .. .. .. .. .. .. .. ..$ : chr [1:1051943] "1600101" "1600111" "1600115" "1600123" ...
# .. .. .. .. .. .. .. ..$ : chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# .. .. .. .. .. ..$ RedSD : int [1:1051943, 1:1346] 1247 1080 2917 687 788 427 833 639 3395 2184 ...
# .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
# .. .. .. .. .. .. .. ..$ : chr [1:1051943] "1600101" "1600111" "1600115" "1600123" ...
# .. .. .. .. .. .. .. ..$ : chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# .. .. .. .. .. ..$ NBeads : int [1:1051943, 1:1346] 8 8 10 13 10 12 19 25 17 11 ...
# .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
# .. .. .. .. .. .. .. ..$ : chr [1:1051943] "1600101" "1600111" "1600115" "1600123" ...
# .. .. .. .. .. .. .. ..$ : chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# .. .. .. .. ..@ elementType : chr "ANY"
# .. .. .. .. ..@ elementMetadata: NULL
# .. .. .. .. ..@ metadata : list()
# ..@ NAMES : chr [1:1051943] "1600101" "1600111" "1600115" "1600123" ...
# ..@ elementMetadata:Formal class 'DFrame' [package "S4Vectors"] with 6 slots
# .. .. ..@ rownames : NULL
# .. .. ..@ nrows : int 1051943
# .. .. ..@ elementType : chr "ANY"
# .. .. ..@ elementMetadata: NULL
# .. .. ..@ metadata : list()
# .. .. ..@ listData : Named list()
# ..@ metadata : list()
str(qc_info)
# List of 6
# $ detP : num [1:866238, 1:1346] 0.00 0.00 0.00 0.00 6.34e-271 ...
# ..- attr(*, "dimnames")=List of 2
# .. ..$ : chr [1:866238] "cg18478105" "cg09835024" "cg14361672" "cg01763666" ...
# .. ..$ : chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# $ nbead : num [1:866238, 1:1346] 10 8 9 15 2 9 11 7 7 5 ...
# ..- attr(*, "dimnames")=List of 2
# .. ..$ : chr [1:866238] "cg18478105" "cg09835024" "cg14361672" "cg01763666" ...
# .. ..$ : chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# $ bisul : Named num [1:1346] 19377 20302 21838 22428 20770 ...
# ..- attr(*, "names")= chr [1:1346] "206375080017_R02C01" "206375080017_R04C01" "206375080017_R06C01" "206375080017_R07C01" ...
# $ badsample : chr [1:75] "206382230128_R07C01" "205841320037_R03C01" "205841330028_R04C01" "205841330028_R05C01" ...
# $ badCpG : chr [1:9369] "cg08795713" "cg11890956" "cg19453472" "cg14428027" ...
# $ outlier_sample: chr [1:65] "206375080030_R01C01" "206375080030_R02C01" "206375080030_R04C01" "206375080030_R06C01" ...
Here's the session info:
sessionInfo()
# R version 4.2.1 (2022-06-23)
# Platform: x86_64-redhat-linux-gnu (64-bit)
# Running under: Springdale Linux 7.9 (Verona)
#
# Matrix products: default
# BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.2.so
#
# locale:
# [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
# [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
# [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
# [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C
# [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#
# attached base packages:
# [1] parallel stats4 stats graphics grDevices utils datasets
# [8] methods base
#
# other attached packages:
# [1] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
# [2] IlluminaHumanMethylationEPICmanifest_0.3.0
# [3] ENmix_1.32.0
# [4] doParallel_1.0.17
# [5] magrittr_2.0.3
# [6] minfi_1.42.0
# [7] bumphunter_1.38.0
# [8] locfit_1.5-9.7
# [9] iterators_1.0.14
# [10] foreach_1.5.2
# [11] Biostrings_2.64.1
# [12] XVector_0.36.0
# [13] SummarizedExperiment_1.26.1
# [14] Biobase_2.56.0
# [15] MatrixGenerics_1.8.1
# [16] matrixStats_0.63.0
# [17] GenomicRanges_1.48.0
# [18] GenomeInfoDb_1.32.4
# [19] IRanges_2.30.1
# [20] S4Vectors_0.34.0
# [21] BiocGenerics_0.42.0
# [22] boot_1.3-28.1
#
# loaded via a namespace (and not attached):
# [1] AnnotationHub_3.4.0 BiocFileCache_2.4.0
# [3] plyr_1.8.8 splines_4.2.1
# [5] BiocParallel_1.30.4 digest_0.6.31
# [7] htmltools_0.5.4 RPMM_1.25
# [9] fansi_1.0.4 memoise_2.0.1
# [11] cluster_2.1.3 tzdb_0.3.0
# [13] limma_3.52.4 readr_2.1.4
# [15] annotate_1.74.0 askpass_1.1
# [17] siggenes_1.70.0 prettyunits_1.1.1
# [19] blob_1.2.3 rappdirs_0.3.3
# [21] dplyr_1.1.0 crayon_1.5.2
# [23] RCurl_1.98-1.10 genefilter_1.78.0
# [25] GEOquery_2.64.2 impute_1.70.0
# [27] survival_3.3-1 glue_1.6.2
# [29] zlibbioc_1.42.0 DelayedArray_0.22.0
# [31] Rhdf5lib_1.18.2 HDF5Array_1.24.2
# [33] DBI_1.1.3 rngtools_1.5.2
# [35] Rcpp_1.0.10 xtable_1.8-4
# [37] progress_1.2.2 bit_4.0.5
# [39] mclust_6.0.0 preprocessCore_1.58.0
# [41] httr_1.4.4 gplots_3.1.3
# [43] RColorBrewer_1.1-3 ellipsis_0.3.2
# [45] pkgconfig_2.0.3 reshape_0.8.9
# [47] XML_3.99-0.13 dbplyr_2.3.0
# [49] utf8_1.2.3 dynamicTreeCut_1.63-1
# [51] tidyselect_1.2.0 rlang_1.0.6
# [53] later_1.3.0 AnnotationDbi_1.58.0
# [55] BiocVersion_3.15.2 tools_4.2.1
# [57] cachem_1.0.6 cli_3.6.0
# [59] generics_0.1.3 RSQLite_2.3.0
# [61] ExperimentHub_2.4.0 stringr_1.5.0
# [63] fastmap_1.1.0 yaml_2.3.7
# [65] bit64_4.0.5 beanplot_1.3.1
# [67] caTools_1.18.2 scrime_1.3.5
# [69] purrr_1.0.1 KEGGREST_1.36.3
# [71] nlme_3.1-157 doRNG_1.8.6
# [73] sparseMatrixStats_1.8.0 mime_0.12
# [75] nor1mix_1.3-0 xml2_1.3.3
# [77] biomaRt_2.52.0 compiler_4.2.1
# [79] rstudioapi_0.14 filelock_1.0.2
# [81] curl_5.0.0 png_0.1-8
# [83] interactiveDisplayBase_1.34.0 tibble_3.1.8
# [85] geneplotter_1.74.0 stringi_1.7.12
# [87] GenomicFeatures_1.48.4 lattice_0.20-45
# [89] Matrix_1.5-3 multtest_2.52.0
# [91] vctrs_0.5.2 pillar_1.8.1
# [93] lifecycle_1.0.3 rhdf5filters_1.8.0
# [95] BiocManager_1.30.19 data.table_1.14.8
# [97] bitops_1.0-7 httpuv_1.6.9
# [99] rtracklayer_1.56.1 R6_2.5.1
# [101] BiocIO_1.6.0 promises_1.2.0.1
# [103] KernSmooth_2.23-20 codetools_0.2-18
# [105] pkgload_1.3.2 gtools_3.9.4
# [107] MASS_7.3-57 assertthat_0.2.1
# [109] rhdf5_2.40.0 openssl_2.0.5
# [111] rjson_0.2.21 GenomicAlignments_1.32.1
# [113] Rsamtools_2.12.0 GenomeInfoDbData_1.2.8
# [115] hms_1.1.2 quadprog_1.5-8
# [117] grid_4.2.1 tidyr_1.3.0
# [119] base64_2.0.1 DelayedMatrixStats_1.18.2
# [121] illuminaio_0.38.0 shiny_1.7.4
# [123] restfulr_0.0.15
Can you access the raw IDAT files? If so, try using the ENmix function readidat() to create an rgDataSet as input for preprocessENmix(), instead of passing a minfi data object to ENmix functions. I tested minfi data objects with ENmix a few years ago, but if something has changed in minfi since then, ENmix may not handle it properly.
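A minimal sketch of that workflow follows, assuming the IDAT files sit under a single directory. The wrapper name `run_enmix_from_idat` and its arguments `idat_dir` and `n_cores` are placeholders for illustration, and QC is recomputed on the rgDataSet so that both inputs to preprocessENmix() come from the same object.

```r
# Sketch only: run_enmix_from_idat, idat_dir and n_cores are illustrative
# names, not part of ENmix.
run_enmix_from_idat <- function(idat_dir, n_cores = 1) {
  if (!requireNamespace("ENmix", quietly = TRUE)) {
    stop("Package 'ENmix' is required for this sketch.")
  }
  # Read the raw IDAT files directly into ENmix's own rgDataSet object
  rgset <- ENmix::readidat(path = idat_dir)
  # Recompute QC on the rgDataSet rather than reusing QC from a minfi object
  qc <- ENmix::QCinfo(rgset, distplot = FALSE)
  # Background and dye-bias correction with the package defaults
  ENmix::preprocessENmix(rgset, QCinfo = qc, nCores = n_cores)
}
```

Usage would look like `pre <- run_enmix_from_idat("path/to/idats")`, where the path is a placeholder for wherever the raw files are stored.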
Thank you! I've successfully run the function on a subset of the data, so I don't think this is the issue.
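One way such a subset test could be set up is sketched below. The helper `subset_qc` is hypothetical and uses toy data; it assumes the QCinfo list layout shown in the `str()` output above. Note `drop = FALSE`, which keeps `detP` and `nbead` two-dimensional even when a single sample is selected.

```r
# Hypothetical helper: restrict a QCinfo-style list to chosen sample IDs so
# that it matches a subsetted intensity object.
subset_qc <- function(qc, keep_ids) {
  # drop = FALSE keeps the matrices 2-D even for a single sample
  qc$detP  <- qc$detP[, keep_ids, drop = FALSE]
  qc$nbead <- qc$nbead[, keep_ids, drop = FALSE]
  qc$bisul <- qc$bisul[keep_ids]
  qc$badsample      <- intersect(qc$badsample, keep_ids)
  qc$outlier_sample <- intersect(qc$outlier_sample, keep_ids)
  # badCpG is left unchanged; strictly it depends on which samples are kept
  qc
}

# Toy data standing in for qc_info
ids <- c("S1", "S2", "S3")
toy_qc <- list(
  detP  = matrix(0, nrow = 2, ncol = 3, dimnames = list(c("cg1", "cg2"), ids)),
  nbead = matrix(5, nrow = 2, ncol = 3, dimnames = list(c("cg1", "cg2"), ids)),
  bisul = setNames(c(1, 2, 3), ids),
  badsample = "S2",
  badCpG = character(0),
  outlier_sample = c("S2", "S3")
)
sub_qc <- subset_qc(toy_qc, "S3")
dim(sub_qc$detP)
```

The matching subset of the intensity object would be `rgset[, keep_ids]`, and both pieces could then be passed to preprocessENmix() together.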
We ended up trying this and it avoided the error! Thank you for the suggestion.