Dear list,
I came across an issue when reading Illumina DNA methylation EPIC data using minfi/illuminaio, which I could trace to bead locations differing between chips (see output below). This has consequences for reading data with minfi (and probably other packages as well), since minfi, if I'm correct, assumes that the featureNames (bead locations) are the same across all idat files that are read at once.
Have people seen this before, and how should I solve it? One ad-hoc solution I can think of is reading each chip separately and combining the results afterwards. A more robust solution could be to reorder the bead locations of each channel, e.g. in illuminaio, but this would reduce reading efficiency and will not always be necessary.
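To make the reordering idea concrete, here is a minimal sketch in base R. It is not how minfi or illuminaio handle this; the helper name align_on_common_addresses is hypothetical, and the toy vectors merely stand in for one column of readIDAT(file)$Quants, named by bead location. It also assumes that keeping only the bead locations shared by all files is acceptable.

```r
# Hypothetical helper (not part of minfi or illuminaio): align per-file
# intensity vectors on the bead locations that every file shares.
align_on_common_addresses <- function(quants) {
  # quants: named list of numeric vectors, each named by bead location
  common <- Reduce(intersect, lapply(quants, names))
  # Put the shared locations into one fixed (numeric) order
  common <- common[order(as.integer(common))]
  # Subset every vector by name, so all columns end up in the same row order
  aligned <- vapply(quants, function(q) q[common], numeric(length(common)))
  rownames(aligned) <- common
  aligned
}

# Toy data (made up): two "chips" with partly different locations and order
chipA <- c("1600101" = 10, "1600111" = 20, "40770147" = 30)
chipB <- c("1600111" = 21, "99810956" = 99, "1600101" = 11)

aligned <- align_on_common_addresses(list(chipA = chipA, chipB = chipB))
print(aligned)
```

Subsetting by name rather than by position is what makes the result independent of the order in which each file stores its bead locations. Locations present on only some chips are silently dropped here, which may or may not be what one wants.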
Regards,
Maarten van Iterson
> library(illuminaio)
> idats <- dir(pattern="idat")
> beadlocations <- simplify2array(lapply(idats, function(idat) as.integer(rownames(readIDAT(idat)$Quants))))
> colnames(beadlocations) <- basename(idats)
> head(beadlocations) ##first six bead locations of two samples from different chips
200325570026_R01C01_Grn.idat 200325570026_R01C01_Red.idat
[1,] 1600101 1600101
[2,] 1600111 1600111
[3,] 1600115 1600115
[4,] 1600123 1600123
[5,] 1600131 1600131
[6,] 1600135 1600135
200705860049_R01C01_Grn.idat 200705860049_R01C01_Red.idat
[1,] 1600101 1600101
[2,] 1600111 1600111
[3,] 1600115 1600115
[4,] 1600123 1600123
[5,] 1600131 1600131
[6,] 1600135 1600135
> tail(beadlocations) ##last six bead locations show the differences
200325570026_R01C01_Grn.idat 200325570026_R01C01_Red.idat
[1052636,] 40770147 40770147
[1052637,] 45656561 45656561
[1052638,] 71631919 71631919
[1052639,] 70779866 70779866
[1052640,] 60642867 60642867
[1052641,] 71726553 71726553
200705860049_R01C01_Grn.idat 200705860049_R01C01_Red.idat
[1052636,] 99810956 99810956
[1052637,] 99810958 99810958
[1052638,] 99810970 99810970
[1052639,] 99810978 99810978
[1052640,] 99810990 99810990
[1052641,] 99810992 99810992
> sum(beadlocations[,1] != beadlocations[,2])
[1] 0
> sum(beadlocations[,1] != beadlocations[,3]) ##but there are many more differences
[1] 1052502
> sessionInfo()
R Under development (unstable) (2016-03-21 r70361)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] illuminaio_0.14.0
loaded via a namespace (and not attached):
[1] base64_2.0 openssl_0.9.4