Question

Error using read.metharray.exp in minfi to read idat files

poojitha.stemcell ▴ 10

Hi,

I have been using minfi to process Illumina 450K data. I have never had any issues reading in IDAT files using the read.metharray.exp() function with a valid samplesheet that includes a 'Basename' column containing the barcodes of the IDAT files. However, when I tried to do the same today, I repeatedly got an error message stating that none of the green IDAT files exist (shown below). I have carefully checked the format of my samplesheet and the IDAT files in my directory, restarted R, and freshly installed minfi. Still no luck. This function worked for me as recently as yesterday, and I am not sure why I am facing this problem now. Does anyone else have the same problem, or can anyone think of a solution?

Thank you.

Regards,

Poojitha

> samplesheet_blood<-read.csv("samplesheet_blood.csv",header=TRUE,sep=",")

> RGSet_blood<-read.metharray.exp(targets = samplesheet_blood, extended = extended)
Error in read.metharray(files, extended = extended, verbose = verbose, :
The following specified files do not exist:5684819001_RO1CO1_Grn.idat, 5684819001_RO2CO1_Grn.idat, 5684819001_RO3CO1_Grn.idat, 5727920027_RO1CO1_Grn.idat, 5727920027_RO2CO1_Grn.idat, 5727920027_RO3CO1_Grn.idat, 5684819001_RO4CO1_Grn.idat, 5684819001_RO5CO1_Grn.idat, 5684819001_RO6CO1_Grn.idat, 5727920027_RO4CO1_Grn.idat, 5727920027_RO5CO1_Grn.idat, 5727920027_RO6CO1_Grn.idat, 5684819001_RO1CO2_Grn.idat, 5684819001_RO2CO2_Grn.idat, 5684819001_RO3CO2_Grn.idat, 5727920027_RO1CO2_Grn.idat, 5727920027_RO2CO2_Grn.idat, 5727920027_RO3CO2_Grn.idat, 5684819001_RO4CO2_Grn.idat, 5684819001_RO5CO2_Grn.idat, 5684819001_RO6CO2_Grn.idat, 5684819004_RO1CO1_Grn.idat, 5684819004_RO2CO1_Grn.idat, 5684819004_RO3CO1_Grn.idat, 5684819004_RO4CO1_Grn.idat, 5684819004_RO5CO1_Grn.idat, 5684819004_RO6CO1_Grn.idat, 5727920033_RO4CO1_Grn.idat, 5727920033_RO5CO1_Grn.idat, 5727920033_RO6CO1_Grn.idat, 5684819004_RO1CO2_Grn.idat, 5684819004_RO2CO2_Grn.idat, 5684819004_RO3CO2_Grn.idat, 5727920033_RO1CO2_Grn.

minfi read.metharray.exp idat HM450k • 5.1k views

ADD COMMENT • link 6.5 years ago poojitha.stemcell ▴ 10

score 0 · Answer 1 · 2017-10-24

Andy91 ▴ 60

One oddity I observe in your output is that the zeros of the plate position have been replaced with an "O". This does not appear to be the case for the plate IDs.
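A quick way to check for this kind of substitution is sketched below in base R. The helper name `fix_letter_o` is hypothetical, and the sketch assumes that a valid barcode never contains a capital "O", which holds for the `<chip ID>_RxxCyy` pattern used in this thread.

```r
# Hypothetical helper: replace the capital letter "O" with the digit "0".
# Assumes barcodes follow the <chip ID>_RxxCyy pattern, so no genuine "O" occurs.
fix_letter_o <- function(basename) {
  gsub("O", "0", basename, fixed = TRUE)
}

# Two barcodes as they appear in the error message (letter "O" in the position).
bad <- c("5684819001_RO1CO1", "5727920027_RO2CO1")

# Flag entries that contain the letter "O" before changing anything.
has_letter_o <- grepl("O", bad, fixed = TRUE)

fixed <- fix_letter_o(bad)
print(fixed)
```

Running `grepl()` first makes it possible to see which rows were affected before the values are overwritten.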

ADD COMMENT • link 6.5 years ago Andy91 ▴ 60

I get the same error even after correcting it all to zeros.

ADD REPLY • link 6.5 years ago poojitha.stemcell ▴ 10

Could you show the directory structure of the location where your .idat files are stored? Are they in subfolders, or are they all in the same folder? What does your samplesheet look like?

ADD REPLY • link 6.5 years ago Andy91 ▴ 60

score 0 · Answer 2 · 2017-10-25

poojitha.stemcell ▴ 10

My Samplesheet looks like this:

> head(samplesheet_blood)
  SampleName   cell_type    chip.ID chip.C0.position chip.Row.Pos          Basename
1     WB 105 Whole blood 5684819001              C01          R01 5684819001_R01C01
2     WB 218 Whole blood 5684819001              C01          R02 5684819001_R02C01
3     WB 261 Whole blood 5684819001              C01          R03 5684819001_R03C01
4     WB 043 Whole blood 5727920027              C01          R01 5727920027_R01C01
5     WB 160 Whole blood 5727920027              C01          R02 5727920027_R02C01
6     WB 149 Whole blood 5727920027              C01          R03 5727920027_R03C01

My IDAT files are in the directory and these are the only files in my directory.

ADD COMMENT • link 6.5 years ago poojitha.stemcell ▴ 10

So I was able to reproduce your error using the minfiData package from Bioconductor. The "Basename" column in your samplesheet should contain the full or relative path to the IDAT files, e.g. "/home/poojitha/idats/5684819001_R01C01".
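As a rough base-R sketch of that idea, the snippet below builds a "Basename" column from a folder path plus barcode and checks that both channel files exist before anything is passed to minfi. The temporary folder and the empty placeholder files are assumptions made only so that the sketch can run on its own; in practice `idat_dir` would point at the real IDAT folder.

```r
# Assumption: all IDAT files sit directly in one folder. A temporary folder
# with empty placeholder files stands in for it here.
idat_dir <- file.path(tempdir(), "idats_example")
dir.create(idat_dir, showWarnings = FALSE)

barcodes <- c("5684819001_R01C01", "5684819001_R02C01")
placeholder_files <- paste0(rep(barcodes, each = 2), c("_Grn.idat", "_Red.idat"))
invisible(file.create(file.path(idat_dir, placeholder_files)))

# Basename = folder path + barcode, without the _Grn.idat / _Red.idat suffix.
samplesheet <- data.frame(Basename = file.path(idat_dir, barcodes),
                          stringsAsFactors = FALSE)

# Check that both channel files exist for every sample.
grn_ok <- file.exists(paste0(samplesheet$Basename, "_Grn.idat"))
red_ok <- file.exists(paste0(samplesheet$Basename, "_Red.idat"))

# Samples with a missing file, if any.
missing_samples <- samplesheet$Basename[!(grn_ok & red_ok)]
print(missing_samples)
```

An empty `missing_samples` vector suggests the paths resolve from the current working directory; any entries listed there point to rows worth inspecting.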

The easiest way out IMO is to use the function read.metharray.sheet([base_directory]). You will need to change your samplesheet to include a column labeled "Sentrix_ID" and a column labeled "Sentrix_Position". In your case these correspond to "chip.ID" and to "chip.Row.Pos" joined with "chip.C0.position", respectively.

Example:

Sentrix_ID	Sentrix_Position
5684819001	R01C01
5684819001	R02C01
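Going by the samplesheet shown earlier in the thread, one way to derive the two columns in base R is sketched below. It assumes that "Sentrix_Position" is the row position followed by the column position (as in "R01C01"); the data frame is rebuilt by hand from the first two rows purely for illustration, and the output path is a placeholder.

```r
# First two rows of the samplesheet from this thread, rebuilt by hand.
samplesheet_blood <- data.frame(
  SampleName       = c("WB 105", "WB 218"),
  chip.ID          = c("5684819001", "5684819001"),
  chip.C0.position = c("C01", "C01"),
  chip.Row.Pos     = c("R01", "R02"),
  stringsAsFactors = FALSE
)

# Sentrix_ID is the chip ID; Sentrix_Position is row followed by column.
samplesheet_blood$Sentrix_ID <- samplesheet_blood$chip.ID
samplesheet_blood$Sentrix_Position <- paste0(samplesheet_blood$chip.Row.Pos,
                                             samplesheet_blood$chip.C0.position)

# Write the sheet back out as CSV (placeholder location) so it can be
# placed next to the IDAT files.
out_file <- file.path(tempdir(), "samplesheet_blood.csv")
write.csv(samplesheet_blood, out_file, row.names = FALSE)
```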

Once you have that done, minfi should be able to read everything in one go (regardless of whether all idats are in sub-directories).

baseDir <- "path/to/your/folder"  # placeholder: the folder containing the samplesheet and the IDAT files

samplesheet_blood <- read.metharray.sheet(baseDir)

RGset_blood <- read.metharray.exp(targets = samplesheet_blood)  # Note: pass the sheet via the "targets" argument, not the default first argument "base"

ADD REPLY • link 6.5 years ago Andy91 ▴ 60

Suggestion on how to make all of this more robust would be great; happy to make more checks. Perhaps file an issue on GitHub.

ADD REPLY • link 6.5 years ago Kasper Daniel Hansen ★ 6.5k

Hi Kasper,

I think at this point there is not much extra you could do. The manual entry for read.metharray.exp states that the base argument represents the base directory. At most, you could add a note that this is the full path, or the path relative to the working directory, of the directory containing all the IDAT files.

Alternatively, an extra input sanitization step could be implemented that uses a regex to parse the Sentrix ID and position from the input and assumes the IDAT files are in the working directory, but I suspect that could make things more confusing to begin with.
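For illustration only, such a parsing step might look roughly like the base-R sketch below. The helper `parse_basename` is hypothetical and not part of minfi, and the pattern assumes barcodes of the form `<digits>_RxxCyy` as seen in this thread.

```r
# Hypothetical helper (not part of minfi): split a Basename into
# Sentrix ID and Sentrix position, ignoring any leading directory.
parse_basename <- function(x) {
  barcode <- basename(x)
  # Assumed pattern: chip ID digits, underscore, RxxCyy position.
  pattern <- "^([0-9]+)_(R[0-9]{2}C[0-9]{2})$"
  matches <- regmatches(barcode, regexec(pattern, barcode))
  # Non-matching entries yield character(0); map them to NA.
  parts <- t(vapply(matches, function(m) {
    if (length(m) == 3) m[2:3] else c(NA_character_, NA_character_)
  }, character(2)))
  data.frame(Sentrix_ID = parts[, 1],
             Sentrix_Position = parts[, 2],
             stringsAsFactors = FALSE)
}

# One well-formed entry with a path, and one with letter "O" instead of zero.
parsed <- parse_basename(c("/home/poojitha/idats/5684819001_R01C01",
                           "5684819001_RO2CO1"))
print(parsed)
```

Returning NA for entries that do not match keeps malformed barcodes visible instead of silently guessing at them.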

ADD REPLY • link 6.5 years ago Andy91 ▴ 60

Thank you, Andy and Kasper. I renamed the Sentrix ID and position columns and added the full path to the Basename column. It seems to work now. #Problemresolved

ADD REPLY • link 6.5 years ago poojitha.stemcell ▴ 10