I am analysing four EPIC methylation arrays in R 4.2.2 using the minfi package.
However, whenever I run the line of code below, I get the error shown beneath it. Please note that I have searched my sample sheet for duplicate values and found none.
Any help or suggestions on how to proceed would be greatly appreciated!
Thank you
RGsetEx <- read.metharray.exp(targets = sheet)
Error in read.metharray(basenames = files, extended = extended, verbose = verbose, :
  !anyDuplicated(basenames) is not TRUE
sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.utf8 LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] ggplot2_3.4.0
[2] circlize_0.4.15
[3] reshape2_1.4.4
[4] corpcor_1.6.10
[5] CpGassoc_2.60
[6] data.table_1.14.6
[7] qqman_0.1.8
[8] tidyr_1.2.1
[9] pvclust_2.2-0
[10] sqldf_0.4-11
[11] RSQLite_2.2.20
[12] gsubfn_0.7
[13] proto_1.0.0
[14] pcaMethods_1.90.0
[15] sva_3.46.0
[16] BiocParallel_1.32.5
[17] genefilter_1.80.3
[18] mgcv_1.8-41
[19] nlme_3.1-160
[20] dplyr_1.0.10
[21] limma_3.54.0
[22] WGCNA_1.72-1
[23] fastcluster_1.2.3
[24] dynamicTreeCut_1.63-1
[25] GO.db_3.16.0
[26] AnnotationDbi_1.60.0
[27] missMethyl_1.32.0
[28] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
[29] MatrixEQTL_2.3
[30] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1
[31] IlluminaHumanMethylation450kmanifest_0.4.0
[32] FlowSorted.Blood.EPIC_2.2.0
[33] ExperimentHub_2.6.0
[34] AnnotationHub_3.6.0
[35] BiocFileCache_2.6.0
[36] dbplyr_2.3.0
[37] FlowSorted.Blood.450k_1.36.0
[38] IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
[39] IlluminaHumanMethylationEPICmanifest_0.3.0
[40] minfi_1.44.0
[41] bumphunter_1.40.0
[42] locfit_1.5-9.7
[43] iterators_1.0.14
[44] foreach_1.5.2
[45] Biostrings_2.66.0
[46] XVector_0.38.0
[47] SummarizedExperiment_1.28.0
[48] Biobase_2.58.0
[49] MatrixGenerics_1.10.0
[50] matrixStats_0.63.0
[51] GenomicRanges_1.50.2
[52] GenomeInfoDb_1.34.6
[53] IRanges_2.32.0
[54] S4Vectors_0.36.1
[55] BiocGenerics_0.44.0
loaded via a namespace (and not attached):
[1] utf8_1.2.2 tidyselect_1.2.0
[3] htmlwidgets_1.6.1 grid_4.2.2
[5] munsell_0.5.0 codetools_0.2-18
[7] preprocessCore_1.60.2 chron_2.3-58
[9] interp_1.1-3 statmod_1.5.0
[11] withr_2.5.0 colorspace_2.0-3
[13] filelock_1.0.2 knitr_1.41
[15] rstudioapi_0.14 GenomeInfoDbData_1.2.9
[17] bit64_4.0.5 rhdf5_2.42.0
[19] vctrs_0.5.1 generics_0.1.3
[21] xfun_0.36 R6_2.5.1
[23] doParallel_1.0.17 illuminaio_0.40.0
[25] bitops_1.0-7 rhdf5filters_1.10.0
[27] cachem_1.0.6 reshape_0.8.9
[29] DelayedArray_0.23.2 assertthat_0.2.1
[31] promises_1.2.0.1 BiocIO_1.8.0
[33] scales_1.2.1 nnet_7.3-18
[35] gtable_0.3.1 rlang_1.0.6
[37] calibrate_1.7.7 GlobalOptions_0.1.2
[39] splines_4.2.2 rtracklayer_1.58.0
[41] impute_1.72.3 GEOquery_2.66.0
[43] checkmate_2.1.0 BiocManager_1.30.19
[45] yaml_2.3.6 GenomicFeatures_1.50.3
[47] backports_1.4.1 httpuv_1.6.8
[49] Hmisc_4.7-2 tcltk_4.2.2
[51] tools_4.2.2 nor1mix_1.3-0
[53] ellipsis_0.3.2 RColorBrewer_1.1-3
[55] siggenes_1.72.0 Rcpp_1.0.9
[57] plyr_1.8.8 base64enc_0.1-3
[59] sparseMatrixStats_1.10.0 progress_1.2.2
[61] zlibbioc_1.44.0 purrr_1.0.1
[63] RCurl_1.98-1.9 prettyunits_1.1.1
[65] rpart_4.1.19 openssl_2.0.5
[67] deldir_1.0-6 cluster_2.1.4
[69] magrittr_2.0.3 hms_1.1.2
[71] mime_0.12 xtable_1.8-4
[73] XML_3.99-0.13 jpeg_0.1-10
[75] mclust_6.0.0 shape_1.4.6
[77] gridExtra_2.3 compiler_4.2.2
[79] biomaRt_2.54.0 tibble_3.1.8
[81] crayon_1.5.2 htmltools_0.5.4
[83] later_1.3.0 tzdb_0.3.0
[85] Formula_1.2-4 DBI_1.1.3
[87] MASS_7.3-58.1 rappdirs_0.3.3
[89] Matrix_1.5-1 readr_2.1.3
[91] cli_3.6.0 quadprog_1.5-8
[93] pkgconfig_2.0.3 GenomicAlignments_1.34.0
[95] foreign_0.8-83 xml2_1.3.3
[97] annotate_1.76.0 rngtools_1.5.2
[99] multtest_2.54.0 beanplot_1.3.1
[101] doRNG_1.8.6 scrime_1.3.5
[103] stringr_1.5.0 digest_0.6.31
[105] base64_2.0.1 htmlTable_2.4.1
[107] edgeR_3.40.2 DelayedMatrixStats_1.20.0
[109] restfulr_0.0.15 curl_5.0.0
[111] shiny_1.7.4 Rsamtools_2.14.0
[113] rjson_0.2.21 lifecycle_1.0.3
[115] Rhdf5lib_1.20.0 askpass_1.1
[117] fansi_1.0.3 pillar_1.8.1
[119] lattice_0.20-45 KEGGREST_1.38.0
[121] fastmap_1.1.0 httr_1.4.4
[123] survival_3.4-0 interactiveDisplayBase_1.36.0
[125] glue_1.6.2 png_0.1-8
[127] BiocVersion_3.16.0 bit_4.0.5
[129] stringi_1.7.12 HDF5Array_1.26.0
[131] blob_1.2.3 org.Hs.eg.db_3.16.0
[133] latticeExtra_0.6-30 memoise_2.0.1
Thank you, James.
The console actually reports that there are 2 duplicates. However, when I apply conditional formatting across my Excel sample sheet, no duplicates are detected (I have also checked by eye). Any suggestions?
The 'Basename' isn't something that exists in the SampleSheet.csv that is read in. As an example
The 'Basename' is generated from the sample sheet together with the path to the sample sheet. You could load the sample sheet into Excel and look for duplicates, but what is being duplicated is the combination of the Sentrix_ID and Sentrix_Position, so you need to check the two columns together rather than each one separately. For whatever reason, you have duplicates of that combination.
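To make the point about the combined key concrete, here is a minimal base R sketch of how one might look for repeated Sentrix_ID plus Sentrix_Position pairs. The data frame and every value in it are made up for illustration, and the key is only a stand-in for the real 'Basename', which minfi builds from the matching IDAT file path. If I recall correctly, the sheet returned by read.metharray.sheet() renames these columns to Slide and Array, so adjust the column names to whatever your object actually contains.

```r
# Toy sample sheet; every value here is hypothetical.
sheet <- data.frame(
  Sample_Name      = c("s1", "s2", "s3", "s4"),
  Sentrix_ID       = c("200000000001", "200000000001",
                       "200000000002", "200000000001"),
  Sentrix_Position = c("R01C01", "R02C01", "R01C01", "R01C01"),
  stringsAsFactors = FALSE
)

# Combine slide and position into one key. IDAT files are named
# <Sentrix_ID>_<Sentrix_Position>_Grn.idat / _Red.idat, so this key
# plays the role of the file stem at the end of each Basename.
key <- paste(sheet$Sentrix_ID, sheet$Sentrix_Position, sep = "_")

# Flag every row involved in a duplicate, not only the later copies.
is_dup   <- duplicated(key) | duplicated(key, fromLast = TRUE)
dup_rows <- sheet[is_dup, ]
print(dup_rows)

# On a real targets object the same idea can be applied to the
# Basename column directly:
#   sheet$Basename[duplicated(sheet$Basename)]
```

In this toy sheet, samples s1 and s4 share both slide and position, so they are the rows returned. Neither column looks wrong on its own, which is why a column-by-column check can miss the problem.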
Thank you, you were correct: there was a problem with the Sentrix_ID. The issue has now been resolved :)