Question: Error using read.metharray.exp in minfi to read idat files
4 weeks ago, poojitha.stemcell wrote:

Hi,

I have been using minfi to process Illumina 450K data. I have never had any issues reading in IDAT files using the read.metharray.exp() function with a valid samplesheet that includes a 'Basename' column containing the barcodes of the IDAT files. However, when I tried to do the same today, I repeatedly got an error message stating that none of the green IDAT files exist (shown below). I have carefully checked the format of my samplesheet and the IDAT files in my directory. I restarted R and freshly installed minfi. Still no luck. This function worked for me as recently as yesterday, so I am not sure why I am facing this problem now. Does anyone else have the same problem, or can anyone think of a solution?

Thank you.

Regards,

Poojitha

> samplesheet_blood<-read.csv("samplesheet_blood.csv",header=TRUE,sep=",")

> RGSet_blood<-read.metharray.exp(targets = samplesheet_blood, extended = extended)
Error in read.metharray(files, extended = extended, verbose = verbose,  : 
  The following specified files do not exist:5684819001_RO1CO1_Grn.idat, 5684819001_RO2CO1_Grn.idat, 5684819001_RO3CO1_Grn.idat, 5727920027_RO1CO1_Grn.idat, 5727920027_RO2CO1_Grn.idat, 5727920027_RO3CO1_Grn.idat, 5684819001_RO4CO1_Grn.idat, 5684819001_RO5CO1_Grn.idat, 5684819001_RO6CO1_Grn.idat, 5727920027_RO4CO1_Grn.idat, 5727920027_RO5CO1_Grn.idat, 5727920027_RO6CO1_Grn.idat, 5684819001_RO1CO2_Grn.idat, 5684819001_RO2CO2_Grn.idat, 5684819001_RO3CO2_Grn.idat, 5727920027_RO1CO2_Grn.idat, 5727920027_RO2CO2_Grn.idat, 5727920027_RO3CO2_Grn.idat, 5684819001_RO4CO2_Grn.idat, 5684819001_RO5CO2_Grn.idat, 5684819001_RO6CO2_Grn.idat, 5684819004_RO1CO1_Grn.idat, 5684819004_RO2CO1_Grn.idat, 5684819004_RO3CO1_Grn.idat, 5684819004_RO4CO1_Grn.idat, 5684819004_RO5CO1_Grn.idat, 5684819004_RO6CO1_Grn.idat, 5727920033_RO4CO1_Grn.idat, 5727920033_RO5CO1_Grn.idat, 5727920033_RO6CO1_Grn.idat, 5684819004_RO1CO2_Grn.idat, 5684819004_RO2CO2_Grn.idat, 5684819004_RO3CO2_Grn.idat, 5727920033_RO1CO2_Grn.

 

4 weeks ago, Andy91 (Netherlands) wrote:

One oddity I observe in your output is that the zeros of the plate position have been replaced with an "O". This does not appear to be the case for the plate IDs.
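
A typo like this can be caught before calling minfi by testing each Basename against the expected barcode pattern. The sketch below uses only base R. The helper name `check_basenames` and the pattern (a numeric ID, then `R##C##`) are assumptions based on the names shown in this thread, not part of minfi.

```r
# Hypothetical helper (not part of minfi): return the Basename entries that do
# not look like <digits>_R<2 digits>C<2 digits>, e.g. a letter "O" typed for a zero.
check_basenames <- function(basenames) {
  # Only the file-name part is checked, so a leading directory is allowed.
  ok <- grepl("^[0-9]+_R[0-9]{2}C[0-9]{2}$", basename(as.character(basenames)))
  basenames[!ok]
}

bad <- check_basenames(c("5684819001_R01C01", "5684819001_RO1CO1"))
print(bad)
```

An empty result means every entry matches the assumed pattern; it does not prove that the files exist on disk.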


I get the same error even after correcting it all to zeros. 

written 4 weeks ago by poojitha.stemcell

Could you show the directory structure of the location where your .idat files are stored? Are they in subfolders, or are they all in the same folder? What does your samplesheet look like?

written 4 weeks ago by Andy91
4 weeks ago, poojitha.stemcell wrote:

My Samplesheet looks like this:

> head(samplesheet_blood)
  SampleName   cell_type    chip.ID chip.C0.position chip.Row.Pos          Basename
1     WB 105 Whole blood 5684819001              C01          R01 5684819001_R01C01
2     WB 218 Whole blood 5684819001              C01          R02 5684819001_R02C01
3     WB 261 Whole blood 5684819001              C01          R03 5684819001_R03C01
4     WB 043 Whole blood 5727920027              C01          R01 5727920027_R01C01
5     WB 160 Whole blood 5727920027              C01          R02 5727920027_R02C01
6     WB 149 Whole blood 5727920027              C01          R03 5727920027_R03C01

 

My IDAT files are in the directory and these are the only files in my directory. 


So I was able to reproduce your error using the minfiData package from Bioconductor. The "Basename" column in your samplesheet should contain the full or relative path to the IDAT files, e.g. "/home/poojitha/idats/5684819001_R01C01".
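
As a rough base-R sketch of that fix, the helper below prepends a directory to each barcode and warns about expected files that are missing. The name `add_idat_paths` is hypothetical and not part of minfi. The `_Red.idat` check assumes each array also has a red-channel file next to the `_Grn.idat` file named in the error message.

```r
# Hypothetical helper (not part of minfi): put the directory into Basename and
# warn about expected IDAT files that are not on disk.
add_idat_paths <- function(sheet, idat_dir) {
  sheet$Basename <- file.path(idat_dir, basename(as.character(sheet$Basename)))
  expected <- c(paste0(sheet$Basename, "_Grn.idat"),
                paste0(sheet$Basename, "_Red.idat"))
  missing <- expected[!file.exists(expected)]
  if (length(missing) > 0) {
    warning("Missing IDAT files: ", paste(missing, collapse = ", "))
  }
  sheet
}

# Demo with empty placeholder files in a temporary directory.
demo_dir <- tempfile("idats")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("5684819001_R01C01_Grn.idat",
                                  "5684819001_R01C01_Red.idat")))
demo_sheet <- data.frame(Basename = "5684819001_R01C01", stringsAsFactors = FALSE)
demo_sheet <- add_idat_paths(demo_sheet, demo_dir)
print(file.exists(paste0(demo_sheet$Basename, "_Grn.idat")))
```

With real data, `idat_dir` would be the folder holding the IDAT files, and the returned sheet could then be passed as the `targets` argument.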

The easiest way out IMO is to use the function read.metharray.sheet([base_directory]). You will need to change some columns in your samplesheet so that it includes a column labeled "Sentrix_ID" and a column labeled "Sentrix_Position", which correspond to "chip.ID" and to "chip.Row.Pos" followed by "chip.C0.position" in your case.

Example:

Sentrix_ID Sentrix_Position
5684819001               R01C01
5684819001               R02C01
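
One way to derive these two columns from the existing samplesheet is sketched below in base R. The toy data frame reuses the column names and values shown earlier in this thread; treating Sentrix_Position as the row followed by the column is inferred from the Basename values (e.g. R01C01).

```r
# Toy samplesheet using the column names from this thread.
sheet <- data.frame(
  chip.ID = c("5684819001", "5684819001"),
  chip.C0.position = c("C01", "C01"),
  chip.Row.Pos = c("R01", "R02"),
  stringsAsFactors = FALSE
)

# Sentrix_ID is the chip ID; Sentrix_Position is the row followed by the column.
sheet$Sentrix_ID <- sheet$chip.ID
sheet$Sentrix_Position <- paste0(sheet$chip.Row.Pos, sheet$chip.C0.position)

print(sheet[, c("Sentrix_ID", "Sentrix_Position")])
```

The updated sheet can be saved with write.csv(sheet, "samplesheet_blood.csv", row.names = FALSE) in the folder that holds the samplesheet.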

Once you have that done, minfi should be able to read everything in one go (regardless of whether all idats are in sub-directories).

baseDir <- "/path/to/idats"  # placeholder: folder containing the samplesheet and the idat files

samplesheet_blood <- read.metharray.sheet(baseDir)

RGset_blood <- read.metharray.exp(targets = samplesheet_blood)  # Take careful note that you pass the sheet via the "targets" argument and not the default "base"
written 4 weeks ago by Andy91
Suggestion on how to make all of this more robust would be great; happy to make more checks. Perhaps file an issue on GitHub.
written 4 weeks ago by Kasper Daniel Hansen

Hi Kasper,

I think at this point there is not much extra you could do. In the manual, the entry for read.metharray.exp states that the base argument represents the base directory. At most you could perhaps add a note that this is the full path, or the path relative to the working directory, to the directory containing all the idat files.

Alternatively, an extra input sanitization step could be implemented that uses a regex to parse the Sentrix IDs and positions from the input and assumes the idat files are in the working directory, but I suspect that could make things more confusing to begin with.
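
For illustration only, such a parsing step might look like the base-R sketch below. The function name `parse_basename` and the assumed pattern (digits, underscore, `R##C##`, optional `_Grn.idat` or `_Red.idat` suffix) are hypothetical; nothing here is existing minfi behaviour.

```r
# Hypothetical helper (not part of minfi): split a Basename-like string into
# Sentrix ID and position, ignoring any directory and channel suffix.
parse_basename <- function(x) {
  stem <- sub("_(Grn|Red)\\.idat$", "", basename(x))
  m <- regmatches(stem, regexec("^([0-9]+)_(R[0-9]{2}C[0-9]{2})$", stem))
  # Entries that do not match the pattern come back as NA.
  pick <- function(i) {
    vapply(m, function(p) if (length(p) == 3) p[i] else NA_character_, character(1))
  }
  data.frame(Sentrix_ID = pick(2), Sentrix_Position = pick(3),
             stringsAsFactors = FALSE)
}

parsed <- parse_basename(c("/home/poojitha/idats/5684819001_R01C01_Grn.idat",
                           "5684819001_RO2CO1"))
print(parsed)
```

Returning NA for malformed entries, rather than guessing, keeps typos such as a letter "O" visible to the user.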

written 4 weeks ago by Andy91

Thank you Andy and Kasper. I tried renaming the Sentrix ID and position columns and added full path to Basename column. It seems to work now. #Problemresolved

written 26 days ago by poojitha.stemcell