Question: Minfi error in reading Basename column
0
3.9 years ago by
parap0
United States
parap0 wrote:

Hello all, 

I am new to methylation data analysis and am having a problem importing data with the minfi package.

I run targets <- read.450k.sheet(baseDir, pattern = "csv$") to read the sample sheet, which has the basic columns (Sample_name, Slide, etc.). The result contains a Basename column with errors, even though my sample sheet has no such column.

The sample sheets of my other datasets did not have a "Basename" column either and ran fine, but when importing this dataset the Basename column comes back with a missing path:

                           Basename  Array       Slide
1 E:/10003885002/10003885002_R01C01 R01C01 10003885002
2                      character(0) R02C01 10003885002

Because of the character(0) entries, creating the RGset also fails with an error:

The following specified files do not exist:character(0)_Grn.idat

Can anyone tell me why I am getting character(0) under Basename?

Do I need to add a Basename column to my sample sheet manually, with the path for each sample? I have not been able to find any information on this.

Please help!

 

minfi methylation • 4.3k views
modified 2.8 years ago by SplittingInfinity50 • written 3.9 years ago by parap0
Answer: Minfi error in reading Basename column
1
3.9 years ago by
United States
James W. MacDonald50k wrote:

When you post a question, please use the 'Question' type, rather than 'Tutorial'. As the name might suggest, a Tutorial post is intended to provide a tutorial, rather than ask a question.

The Basename column is generated programmatically, by looking at information in your SampleSheet.csv and then inferring the file name for the corresponding Grn.idat file. In your case, the expectation is that there will be a file

E:/10003885002/10003885002_R02C01_Grn.idat

When that file isn't found, character(0) is returned. So you are missing at least one IDAT file, and you need to figure out why some of your raw data files are missing.
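To see which files the sample sheet expects but the disk lacks, something like the following may help. It is only a sketch in base R: find_missing_idats() is a hypothetical helper, not a minfi function, and it assumes the sheet already has Slide and Array columns and that files are named <Slide>_<Array>_Grn.idat. The temporary directory and toy sheet are made up for the demonstration.

```r
# Hypothetical helper (not part of minfi): list the Grn.idat files that the
# sample sheet implies but that are absent anywhere under base_dir.
find_missing_idats <- function(sheet, base_dir) {
  expected <- sprintf("%s_%s_Grn.idat", sheet$Slide, sheet$Array)
  present <- basename(list.files(base_dir, recursive = TRUE))
  expected[!expected %in% present]
}

# Toy demonstration in a temporary directory: only R01C01 is on disk.
base_dir <- file.path(tempdir(), "idat_demo")
dir.create(file.path(base_dir, "10003885002"), recursive = TRUE,
           showWarnings = FALSE)
file.create(file.path(base_dir, "10003885002", "10003885002_R01C01_Grn.idat"))

sheet <- data.frame(
  Slide = c("10003885002", "10003885002"),
  Array = c("R01C01", "R02C01"),
  stringsAsFactors = FALSE
)

missing_files <- find_missing_idats(sheet, base_dir)
print(missing_files)
```

Note that this compares exact file names, whereas minfi matches with grep() on full paths, so the two checks can disagree in edge cases.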

written 3.9 years ago by James W. MacDonald50k
1

Just to add my two bits in case someone comes across this error. I kept getting a similar error, and when I looked I realized that the file names were rendered incorrectly because I had originally used Excel to make my sample sheet. Excel treats the barcode as a number and automatically displays it in scientific notation. It is a barcode, not a number, so make sure to change the cell format to Number with no decimal places. It worked perfectly after I saved it to CSV again.
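A quick way to spot this kind of damage is to read the sheet with every column as character and flag slide barcodes that are not plain digits. This is a sketch in base R; the inline CSV text and the Sentrix_ID column name are assumptions for illustration, so substitute your own file and column.

```r
# Made-up sample sheet: the second barcode was mangled by a spreadsheet.
csv_text <- c(
  "Sample_Name,Sentrix_ID,Sentrix_Position",
  "s1,10003885002,R01C01",
  "s2,1.00039E+10,R02C01"
)

# Read every column as character so R does not reinterpret the barcodes.
sheet <- read.csv(text = csv_text, colClasses = "character")

# A slide barcode should consist of digits only.
bad <- !grepl("^[0-9]+$", sheet$Sentrix_ID)
print(sheet[bad, ])
```

Once a barcode has been saved in scientific notation, the dropped digits cannot be recovered from the CSV, so the flagged rows have to be re-entered from the original source.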

written 2.3 years ago by Ahdee40

Thanks for the prompt response, James!

Sure, I will choose the right category when posting next time. Thanks for the correction.

Thanks for pointing out the error. I checked the data, and it seems I received an incomplete dataset, so a wrong Basename column was generated. I removed a few samples and it runs fine now.

written 3.9 years ago by parap0

Hi James,

I am also encountering the same problem, but all the IDAT files are available and for some reason it is not recognizing the paired IDAT files. Any thoughts?

Thanks!

Cristina

written 3.6 years ago by cristinalanata0
Answer: Minfi error in reading Basename column
0
3.7 years ago by
United States
ankita.chatterjee880 wrote:

Thanks, James, for your reply. I am also stuck at this point. I ran 4 chips in 2 sets. When I read the chip data for the first set, everything works fine, but when I try to read data from all four chips, the Basename shows character(0).

As you suggested, I checked all the IDAT files individually and not a single one was missing. How should I proceed from here? Should I prepare a targets file in .txt format and read it into R?

written 3.7 years ago by ankita.chatterjee880

Hi ankita, I am running into the same error and am pretty sure I have all the IDAT files. Were you able to solve this by bypassing Excel?

Thanks!

written 3.6 years ago by cristinalanata0

I resolved this issue by rechecking that the IDAT files match the sample sheet, since the Basename column is populated automatically from the IDAT files and the sample sheet. I would suggest creating a new sample sheet and testing it by reading in a smaller set of samples. That eventually worked for me.

modified 3.6 years ago • written 3.6 years ago by parap0
Answer: Minfi error in reading Basename column
0
2.8 years ago by
United States
Shicheng Guo0 wrote:

I eventually found the reason. Please see the following function, which is used by ChAMP and some other packages to read SampleSheet.csv.

 read.metharray.sheet()

function (base, pattern = "csv$", ignore.case = TRUE, recursive = TRUE,
    verbose = TRUE)
{
    readSheet <- function(file) {
        dataheader <- grep("^\\[DATA\\]", readLines(file), ignore.case = TRUE)
        if (length(dataheader) == 0)
            dataheader <- 0
        df <- read.csv(file, stringsAsFactor = FALSE, skip = dataheader)
        if (length(nam <- grep("Sentrix_Position", names(df),
            ignore.case = TRUE, value = TRUE)) == 1) {
            df$Array <- as.character(df[, nam])
            df[, nam] <- NULL
        }
        if (length(nam <- grep("Array[\\._]ID", names(df), ignore.case = TRUE,
            value = TRUE)) == 1) {
            df$Array <- as.character(df[, nam])
            df[, nam] <- NULL
        }
        if (!"Array" %in% names(df))
            warning(sprintf("Could not infer array name for file: %s",
                file))
        if (length(nam <- grep("Sentrix_ID", names(df), ignore.case = TRUE,
            value = TRUE)) == 1) {
            df$Slide <- as.character(df[, nam])
            df[, nam] <- NULL
        }
        if (length(nam <- grep("Slide[\\._]ID", names(df), ignore.case = TRUE,
            value = TRUE)) == 1) {
            df$Slide <- as.character(df[, nam])
            df[, nam] <- NULL
        }
        if (!"Slide" %in% names(df))
            warning(sprintf("Could not infer slide name for file: %s",
                file))
        else df[, "Slide"] <- as.character(df[, "Slide"])
        if (length(nam <- grep("Plate[\\._]ID", names(df), ignore.case = TRUE,
            value = TRUE)) == 1) {
            df$Plate <- as.character(df[, nam])
            df[, nam] <- NULL
        }
        for (nam in c("Pool_ID", "Sample_Plate", "Sample_Well")) {
            if (nam %in% names(df)) {
                df[[nam]] <- as.character(df[[nam]])
            }
        }
        if (!is.null(df$Array)) {
            patterns <- sprintf("%s_%s_Grn.idat", df$Slide, df$Array)
            allfiles <- list.files(dirname(file), recursive = recursive,
                full.names = TRUE)
            basenames <- sapply(patterns, function(xx) grep(xx,
                allfiles, value = TRUE))
            names(basenames) <- NULL
            basenames <- sub("_Grn\\.idat", "", basenames, ignore.case = TRUE)
            df$Basename <- basenames
        }
        df
    }
    if (!all(file.exists(base)))
        stop("'base' does not exists")
    info <- file.info(base)
    if (!all(info$isdir) && !all(!info$isdir))
        stop("'base needs to be either directories or files")
    if (all(info$isdir)) {
        csvfiles <- list.files(base, recursive = recursive, pattern = pattern,
            ignore.case = ignore.case, full.names = TRUE)
        if (verbose) {
            message("[read.metharray.sheet] Found the following CSV files:\n")
            print(csvfiles)
        }
    }
    else csvfiles <- list.files(base, full.names = TRUE)
    dfs <- lapply(csvfiles, readSheet)
    namesUnion <- Reduce(union, lapply(dfs, names))
    df <- do.call(rbind, lapply(dfs, function(df) {
        newnames <- setdiff(namesUnion, names(df))
        newdf <- matrix(NA, ncol = length(newnames), nrow = nrow(df),
            dimnames = list(NULL, newnames))
        cbind(df, as.data.frame(newdf))
    }))
    df
}
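The literal text character(0) comes from the last few lines of readSheet. Here is a base-R sketch of that step, using made-up file paths: grep() returns a zero-length vector when no file matches a pattern, sapply() then cannot simplify its result and returns a list, and sub() coerces that list to character, which turns the empty element into the string "character(0)".

```r
# Made-up inputs: one IDAT on disk, two rows in the sample sheet.
allfiles <- "/data/10003885002/10003885002_R01C01_Grn.idat"
patterns <- c("10003885002_R01C01_Grn.idat", "10003885002_R02C01_Grn.idat")

# Same steps as in readSheet above.
basenames <- sapply(patterns, function(xx) grep(xx, allfiles, value = TRUE))
names(basenames) <- NULL

# Results of length 1 and 0 cannot be simplified, so this is a list.
print(is.list(basenames))

# sub() coerces the list with as.character(); the empty match becomes text.
basenames_chr <- sub("_Grn\\.idat", "", basenames, ignore.case = TRUE)
print(basenames_chr)
```

By the same mechanism, a pattern that matches more than one file would also produce a malformed entry, so each Slide/Array pair should match exactly one Grn.idat file under the directory that holds the sample sheet.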

 

written 2.8 years ago by Shicheng Guo0
Answer: Minfi error in reading Basename column
0
2.8 years ago by
SplittingInfinity50 wrote:

read.450k() throws the character(0)_Grn.idat error because it couldn't find the file specified in the spreadsheet.

One of the most common reasons is the sample sheet format. Look for trailing spaces or illegal characters in your CSV file.
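As a rough check for stray whitespace, the sheet can be read as character and each key column compared with its trimmed version. This is a sketch in base R; the inline CSV and the Sentrix_ID / Sentrix_Position column names are assumptions for illustration.

```r
# Made-up sample sheet: the first barcode has a trailing space.
csv_text <- c(
  "Sample_Name,Sentrix_ID,Sentrix_Position",
  "s1,10003885002 ,R01C01",
  "s2,10003885002,R02C01"
)
sheet <- read.csv(text = csv_text, colClasses = "character")

# Rows where a key column changes when surrounding whitespace is trimmed.
dirty_rows <- which(
  sheet$Sentrix_ID != trimws(sheet$Sentrix_ID) |
    sheet$Sentrix_Position != trimws(sheet$Sentrix_Position)
)
print(dirty_rows)

# Clean the key columns in memory; the CSV on disk still needs fixing.
key_cols <- c("Sentrix_ID", "Sentrix_Position")
sheet[key_cols] <- lapply(sheet[key_cols], trimws)
```

trimws() only removes spaces, tabs, and line breaks by default, so other invisible characters (for example non-breaking spaces) would need a separate check.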

 

written 2.8 years ago by SplittingInfinity50

I am not sure how people solved the character(0) problem for the Basename, but I have been stuck on it for some time now. I checked the CSV file and confirmed that all the IDATs are present, and I do not think there is a problem with either. Could you please guide me on how to fix this? A snippet of my code:

library(minfi)
baseDir <- "/home/idats"
targets <- read.metharray.sheet(baseDir)
print(targets)

Output with the last few columns:

  sex status   Array  Slide     Basename
1   M Normal 7420085 R06C02 character(0)
2   M Cancer 7420085 R06C02 character(0)
3   M Normal 7420117 R06C02 character(0)
4   M Cancer 7420117 R02C02 character(0)
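One way to debug output like this is to print the file-name patterns that read.metharray.sheet() builds, using the sprintf() call from its source, and compare them with the real file names. The sketch below uses the first row of the output above, taken as printed. Whether the columns are really swapped in the underlying sheet is an assumption, since the printed header may simply be misaligned.

```r
# Values copied from the first row of the printed targets above.
targets <- data.frame(Array = "7420085", Slide = "R06C02",
                      stringsAsFactors = FALSE)

# Pattern construction as in the read.metharray.sheet() source:
# Slide comes first, then Array.
pattern <- sprintf("%s_%s_Grn.idat", targets$Slide, targets$Array)
print(pattern)
```

If the IDAT files are named like 7420085_R06C02_Grn.idat, this pattern cannot match, which would suggest that the slide and array columns of the sample sheet are mislabeled or swapped.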

 

 

 

 

 

written 2.7 years ago by hrishi27n20

I moved the CSV file into the same folder as the IDAT files, made that my working directory, and badabing! It worked. So just move the CSV file.

Also, I made my sample sheet in Excel and saved it as a CSV file. If you open it in a text editor, you can hit return after the last item in the last column of the last row to add a final line break, which fixes the "end of line" error.
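Moving the CSV probably helps because read.metharray.sheet() looks for IDAT files only under the directory that contains the CSV (the list.files(dirname(file), ...) call in its source). Here is a base-R sketch with a made-up folder layout; visible_idats() is a hypothetical helper, not a minfi function.

```r
# Made-up layout: IDATs in root/idats, with a sibling folder root/sheets.
root <- file.path(tempdir(), "layout_demo")
dir.create(file.path(root, "sheets"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "idats"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "idats", "10003885002_R01C01_Grn.idat"))

# Hypothetical helper: Grn.idat files reachable from the CSV's own directory.
visible_idats <- function(csv_path) {
  list.files(dirname(csv_path), pattern = "_Grn\\.idat$", recursive = TRUE)
}

# CSV in a sibling folder: the IDATs are out of reach.
n_sibling <- length(visible_idats(file.path(root, "sheets", "SampleSheet.csv")))
# CSV in the parent folder (or next to the IDATs): they are found.
n_parent <- length(visible_idats(file.path(root, "SampleSheet.csv")))
print(c(sibling = n_sibling, parent = n_parent))
```

So the CSV does not have to sit in exactly the same folder as the IDATs. With recursive = TRUE, any folder above them should work as well.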

 

written 6 months ago by michelle.wedemeyer0