shweta • 0
@shweta-24797

Hello,

I would like to analyse methylation data from both 450K and EPIC arrays. I created a CSV file containing the Sentrix ID, Sentrix position and Basename, where Basename holds the path to the IDAT files. All the files, including the CSV, are in the same folder.

library(methylationArrayAnalysis)
library(knitr)
library(limma)
library(minfi)
library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
library(IlluminaHumanMethylation450kmanifest)
library(RColorBrewer)
library(missMethyl)
library(minfiData)
library(Gviz)
library(DMRcate)
library(stringr)
library(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)
library(conumee)

dataDirectory <- "C:/Users/35389/Desktop/Medullos/All_combined"
target
target$Basename
g_files <- paste0(target$Basename, "_Grn.idat")
all(file.exists(g_files))


# read in the IDAT files listed in the sample sheet
rgset <- read.metharray.exp(targets = target, recursive = TRUE, verbose = TRUE, extended = TRUE)


However, when I read in the files using read.metharray.exp(), I get the following error:

Timing stopped at: 0.14 0.05 0.22
Error in readIDAT(xx) : Cannot read IDAT file. File format error. Unknown magic:

Any help will be greatly appreciated! Thanks in advance :)

MethylationArrayData methy Bioconductor minfi • 114 views
@james-w-macdonald-5106

The error you see means that one or more of your files is problematic. The first thing readIDAT does is check whether each file is an IDAT file (a valid file starts with the four characters "IDAT"), and you have at least one that seems not to be. You can check this for yourself; here's an example using some data I have in hand.

> targets <- read.metharray.sheet("../data/image_data/")
> grn <- paste0(targets$Basename, "_Grn.idat")
> red <- paste0(targets$Basename, "_Red.idat")
> testGrn <- sapply(grn, readChar, nchars = 4)
> testRed <- sapply(red, readChar, nchars = 4)
> testGrn[testGrn != "IDAT"]
named character(0)
> testRed[testRed != "IDAT"]
named character(0)
> c(testGrn, testRed)
../data/image_data/203219730010/203219730010_R01C01_Grn.idat
"IDAT"
../data/image_data/203219730010/203219730010_R02C01_Grn.idat
"IDAT"
../data/image_data/203219730010/203219730010_R03C01_Grn.idat
"IDAT"
../data/image_data/203219730010/203219730010_R04C01_Grn.idat
"IDAT"
../data/image_data/203219730010/203219730010_R05C01_Grn.idat
"IDAT"
../data/image_data/203219730010/203219730010_R06C01_Grn.idat
"IDAT"
../data/image_data/203219730010/203219730010_R01C01_Red.idat
"IDAT"
../data/image_data/203219730010/203219730010_R02C01_Red.idat
"IDAT"
../data/image_data/203219730010/203219730010_R03C01_Red.idat
"IDAT"
../data/image_data/203219730010/203219730010_R04C01_Red.idat
"IDAT"
../data/image_data/203219730010/203219730010_R05C01_Red.idat
"IDAT"
../data/image_data/203219730010/203219730010_R06C01_Red.idat
"IDAT"


Presumably your data will return one or more problematic files, which you can then either exclude or inspect to figure out what the problem is.
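The same check can be wrapped in a small helper. This is only a minimal sketch in base R: `find_bad_idats` is a hypothetical name (it is not part of minfi), and it assumes that every basename should have both a `_Grn.idat` and a `_Red.idat` file. It reads the first four bytes of each file as raw bytes and compares them with "IDAT", and it also flags files that are missing.

```r
# Hypothetical helper (not part of minfi): return the IDAT files that are
# missing or that do not start with the four bytes "IDAT".
find_bad_idats <- function(basenames) {
  files <- c(paste0(basenames, "_Grn.idat"), paste0(basenames, "_Red.idat"))
  has_magic <- vapply(files, function(f) {
    if (!file.exists(f)) return(FALSE)           # missing file counts as bad
    magic <- readBin(f, what = "raw", n = 4L)    # first four bytes of the file
    identical(magic, charToRaw("IDAT"))
  }, logical(1))
  files[!has_magic]
}
```

Calling `find_bad_idats(target$Basename)` on the sample sheet from the question should return a character vector of the suspect paths, or an empty vector if every file looks fine.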


Thank you so much, this helped me spot the incorrect files. I read in my EPIC files and 450K files separately. Within each set the files were also of different sizes, so I had to force them to be read:

# read in the EPIC IDAT files listed in the sample sheet
rgset <- read.metharray.exp(targets = target_EPIC, recursive = TRUE, verbose = TRUE, extended = TRUE, force = TRUE)


As I understand it, forcing the read merges the arrays on the basis of the probes in the smallest files, but this leaves very few probes quantified. I wonder what the reason for this could be, and whether you have any suggestions to get around it. Thanks a lot again! :)


You should read the data in separately and then use combineArrays. I have never done that sort of thing, so it's up to you to figure out if you should completely process the data to a GenomicRatioSet and then combine, or combine first and then process.
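The "combine first, then process" route could look roughly like the sketch below. It is an illustration under stated assumptions, not a tested recipe: `combine_450k_epic` and `target_450K` are hypothetical names, minfi must be installed, and both arguments are assumed to be sample sheets with a Basename column. The IDAT files are read with the default `extended = FALSE`; whether `combineArrays` behaves the same way on extended objects is not something this sketch checks.

```r
# Hypothetical wrapper around minfi calls; assumes minfi is installed.
# target_450K and target_EPIC are sample sheets (data frames with a
# Basename column), one per array type.
combine_450k_epic <- function(target_450K, target_EPIC) {
  # Read each array type on its own so that only like is read with like
  rg_450k <- minfi::read.metharray.exp(targets = target_450K)
  rg_epic <- minfi::read.metharray.exp(targets = target_EPIC)
  # Combine into a single object laid out as a 450k array
  minfi::combineArrays(rg_450k, rg_epic,
                       outType = "IlluminaHumanMethylation450k")
}
```

The result of `combine_450k_epic(target_450K, target_EPIC)` can then go through the usual preprocessing steps. The alternative is to process each set to a GenomicRatioSet first and call `combineArrays` on those objects instead.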