Question: read.metharray() not recognizing some of my EPIC arrays; force = TRUE not working either
Asked 2.5 years ago by Jenny Drnevich (United States)


I'm trying to use minfi's read.metharray.exp() / read.metharray() to read in 48 samples from 6 Illumina EPIC arrays. The arrays were all ordered from Illumina at the same time but appear to come from two main batches: two arrays have Sentrix_IDs 20111453005X and the other four have IDs 2013888700XX. When I try to read them in, I get the following error:

> rgSet <- read.metharray.exp(targets=targets, force = TRUE)
[read.metharray] Trying to parse IDAT files from different arrays.
  Inferred Array sizes and types:
                    array                          size     
201114530050_R05C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530050_R06C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530050_R07C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530050_R08C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530051_R01C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530051_R02C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530051_R03C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530051_R04C01 "IlluminaHumanMethylationEPIC" "1052641"
201388870032_R05C01 "Unknown"                      "1051943"
201388870032_R06C01 "Unknown"                      "1051943"
201388870032_R07C01 "Unknown"                      "1051943"
201388870032_R08C01 "Unknown"                      "1051943"
201388870033_R01C01 "Unknown"                      "1051943"
201388870033_R02C01 "Unknown"                      "1051943"
201388870033_R03C01 "Unknown"                      "1051943"
201388870033_R04C01 "Unknown"                      "1051943"
201388870035_R05C01 "Unknown"                      "1051943"
201388870035_R06C01 "Unknown"                      "1051943"
201388870035_R07C01 "Unknown"                      "1051943"
201388870035_R08C01 "Unknown"                      "1051943"
201388870055_R01C01 "Unknown"                      "1051943"
201388870055_R02C01 "Unknown"                      "1051943"
201388870055_R03C01 "Unknown"                      "1051943"
201114530050_R01C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530050_R02C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530050_R03C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530050_R04C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530051_R05C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530051_R06C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530051_R07C01 "IlluminaHumanMethylationEPIC" "1052641"
201114530051_R08C01 "IlluminaHumanMethylationEPIC" "1052641"
201388870032_R01C01 "Unknown"                      "1051943"
201388870032_R02C01 "Unknown"                      "1051943"
201388870032_R03C01 "Unknown"                      "1051943"
201388870032_R04C01 "Unknown"                      "1051943"
201388870033_R05C01 "Unknown"                      "1051943"
201388870033_R06C01 "Unknown"                      "1051943"
201388870033_R07C01 "Unknown"                      "1051943"
201388870033_R08C01 "Unknown"                      "1051943"
201388870035_R01C01 "Unknown"                      "1051943"
201388870035_R02C01 "Unknown"                      "1051943"
201388870035_R03C01 "Unknown"                      "1051943"
201388870035_R04C01 "Unknown"                      "1051943"
201388870055_R04C01 "Unknown"                      "1051943"
201388870055_R05C01 "Unknown"                      "1051943"
201388870055_R06C01 "Unknown"                      "1051943"
201388870055_R07C01 "Unknown"                      "1051943"
201388870055_R08C01 "Unknown"                      "1051943"
Error in read.metharray(files, extended = extended, verbose = verbose,  : 
  [read.metharray] Trying to parse different IDAT files, of different size and type.

Looking at the help for ?read.metharray, I see that this is another case of arrays with different numbers of probes. However, I can't use the force = TRUE argument, because one set of my arrays doesn't even get recognized as "IlluminaHumanMethylationEPIC": its number of probes, 1051943, is just under the 1052000 threshold in the internal .guessArrayTypes().
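
The size check described above can be pictured with a small sketch. This is not minfi's source code: guess_array_type() is a hypothetical stand-in for the internal .guessArrayTypes(), the lower bound of 1052000 is the threshold mentioned above, and the upper bound of 1053000 is an assumption made only for illustration.

```r
# Hypothetical stand-in for minfi's internal .guessArrayTypes(); illustration only.
# lower: threshold mentioned in the post; upper: assumed for this sketch.
guess_array_type <- function(n_probes, lower = 1052000, upper = 1053000) {
  if (n_probes >= lower && n_probes <= upper) {
    "IlluminaHumanMethylationEPIC"
  } else {
    "Unknown"
  }
}

# Probe counts reported in the error message for the two batches
sizes <- c(first_batch = 1052641, second_batch = 1051943)
types <- vapply(sizes, guess_array_type, character(1))
print(types)
```

Under these assumed bounds the second batch falls 57 probes short of the lower limit (1052000 - 1051943), which would explain why it is labelled "Unknown" and why the error complains about both size and type.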

I tried to get around this by calling debugonce(read.metharray) and manually setting arrayTypes[,1] <- "IlluminaHumanMethylationEPIC" and arrayTypes[,2] <- "ilm10b2.hg19", but I'm still missing something, because when I step through the rest of the function I get an error:

Error in `sampleNames<-`(`*tmp*`, value = c("1600101", "1600111", "1600115",  : number of new names (1052641) should equal number of rows in AnnotatedDataFrame (1051943)

So I'm stuck for now. Help please!
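
The row-count mismatch in the error above (1052641 versus 1051943) is what one would expect when intensity tables covering different probe sets are combined as if they had identical rows. As a toy illustration of the general idea of restricting every table to the probes they all share, here is a minimal base-R sketch. The matrices, the address labels and the helper keep_common_rows() are all invented for this example; it is not minfi code and makes no claim about how minfi handles force = TRUE internally.

```r
# Toy intensity matrices keyed by probe address (row names); values are invented.
array_a <- matrix(c(10, 20, 30, 40), ncol = 1,
                  dimnames = list(c("p1", "p2", "p3", "p4"), "sample_a"))
array_b <- matrix(c(11, 31, 41), ncol = 1,
                  dimnames = list(c("p1", "p3", "p4"), "sample_b"))

# Hypothetical helper: keep only rows present in every matrix, then bind columns.
keep_common_rows <- function(mats) {
  common <- Reduce(intersect, lapply(mats, rownames))
  do.call(cbind, lapply(mats, function(m) m[common, , drop = FALSE]))
}

combined <- keep_common_rows(list(array_a, array_b))
print(dim(combined))
print(rownames(combined))
```

Subsetting by row name rather than by position is the important design choice here: two arrays with different probe counts cannot be assumed to list their probes in the same order.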




> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
 [1] splines   grid      stats4    parallel  stats     graphics  grDevices utils     datasets 
[10] methods   base     

other attached packages:
 [1] stringr_1.2.0                                      
 [2] DMRcate_1.10.8                                     
 [3] DMRcatedata_1.10.1                                 
 [4] DSS_2.14.0                                         
 [5] bsseq_1.10.0                                       
 [6] Gviz_1.18.2                                        
 [7] minfiData_0.20.0                                   
 [8] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0 
 [9] IlluminaHumanMethylation450kmanifest_0.4.0         
[10] matrixStats_0.51.0                                 
[11] missMethyl_1.8.0                                   
[12] RColorBrewer_1.1-2                                 
[13] IlluminaHumanMethylationEPICmanifest_0.3.0         
[14] IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
[15] minfi_1.20.2                                       
[16] bumphunter_1.14.0                                  
[17] locfit_1.5-9.1                                     
[18] iterators_1.0.8                                    
[19] foreach_1.4.3                                      
[20] Biostrings_2.42.1                                  
[21] XVector_0.14.1                                     
[22] SummarizedExperiment_1.4.0                         
[23] GenomicRanges_1.26.4                               
[24] GenomeInfoDb_1.10.3                                
[25] IRanges_2.8.2                                      
[26] S4Vectors_0.12.2                                   
[27] Biobase_2.34.0                                     
[28] BiocGenerics_0.20.0                                
[29] limma_3.30.13                                      

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2              siggenes_1.48.0               mclust_5.2.3                 
 [4] biovizBase_1.22.0             htmlTable_1.9                 base64enc_0.1-3              
 [7] dichromat_2.0-0               base64_2.0                    interactiveDisplayBase_1.12.0
[10] AnnotationDbi_1.36.2          R.methodsS3_1.7.1             codetools_0.2-15             
[13] methylumi_2.20.0              knitr_1.15.1                  Formula_1.2-1                
[16] Rsamtools_1.26.1              annotate_1.52.1               cluster_2.0.5                
[19] GO.db_3.4.0                   R.oo_1.21.0                   shiny_1.0.0                  
[22] httr_1.2.1                    backports_1.0.5               assertthat_0.1               
[25] Matrix_1.2-8                  lazyeval_0.2.0                acepack_1.4.1                
[28] htmltools_0.3.5               tools_3.3.3                   gtable_0.2.0                 
[31] doRNG_1.6                     Rcpp_0.12.10                  multtest_2.30.0              
[34] preprocessCore_1.36.0         nlme_3.1-131                  rtracklayer_1.34.2           
[37] mime_0.5                      ensembldb_1.6.2               rngtools_1.2.4               
[40] gtools_3.5.0                  statmod_1.4.29                XML_3.98-1.5                 
[43] beanplot_1.2                    AnnotationHub_2.6.5          
[46] zlibbioc_1.20.0               MASS_7.3-45                   scales_0.4.1                 
[49] BSgenome_1.42.0               VariantAnnotation_1.20.3      BiocInstaller_1.24.0         
[52] GEOquery_2.40.0               yaml_2.1.14                   memoise_1.0.0                
[55] gridExtra_2.2.1               ggplot2_2.2.1                 pkgmaker_0.22                
[58] biomaRt_2.30.0                rpart_4.1-10                  reshape_0.8.6                
[61] latticeExtra_0.6-28           stringi_1.1.3                 RSQLite_1.1-2                
[64] genefilter_1.56.0             permute_0.9-4                 checkmate_1.8.2              
[67] GenomicFeatures_1.26.3        BiocParallel_1.8.1            bitops_1.0-6                 
[70] nor1mix_1.2-2                 lattice_0.20-34               ruv_0.9.6                    
[73] GenomicAlignments_1.10.1      htmlwidgets_0.8               plyr_1.8.4                   
[76] magrittr_1.5                  R6_2.2.0                      Hmisc_4.0-2                  
[79] DBI_0.6                       foreign_0.8-67                survival_2.41-2              
[82] RCurl_1.95-4.8                nnet_7.3-12                   tibble_1.2                   
[85] data.table_1.10.4             digest_0.6.12                 xtable_1.8-2                 
[88] httpuv_1.3.3                  illuminaio_0.16.0             R.utils_2.5.0                
[91] openssl_0.9.6                 munsell_0.4.3                 registry_0.3                 
[94] BiasedUrn_1.07                quadprog_1.5-5 


I'm getting the same error, but mine is probably because I had one swath in one sample that would not scan. Any resolution so far? Thanks, Adrienne

> RGSet = read.metharray.exp(targets = targets)
Error in read.metharray(files, extended = extended, verbose = verbose,  :
  [read.metharray] Trying to parse IDAT files with different array size but seemingly all of the same type.
  You can force this by 'force=TRUE', see the man page ?read.metharray
> RGSet = read.metharray.exp(targets = targets, force = T)
Error in `sampleNames<-`(`*tmp*`, value = c("1600101", "1600111", "1600115",  :
  number of new names (1051943) should equal number of rows in AnnotatedDataFrame (1051539)

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

FYI, I updated R and RStudio to get the latest version of minfi. That worked for me.

