I have RNA-seq data (18 samples) from which I produced the raw counts using htseq-count. Now I want to analyze the data using edgeR. I've done this dozens of times and had no issues. Not this time. When I execute
samples <- read.csv(file="metadata_composite.csv",header=TRUE,sep=",")
counts <- readDGE(samples$CountFiles)$counts
I get:
Error in taglist[[i]] : subscript out of bounds
I've analyzed the same data using DESeq2 and had no issues whatsoever, so it is something related to edgeR. Here is my data: it consists of 18 directories, each containing a count file (accepted_hits.count), and a directory called edgeR that contains "metadata_composite.csv", the metadata file that I load in the code snippet above.
Can you tell me what the problem is?
I got the same error as Nick N.
> s=read.csv("Samples.csv")
> counts=readDGE(s$cf)$counts
Error in taglist[[i]] : subscript out of bounds
I can't open the link that he provided, and I couldn't find an error in my CSV file. Can you please point out what changes I need to make to fix the error?
The metadata table above is what I am using to run the commands.
Well, for starters, there's no cf column in your metadata table.
I used the paste function to add a cf column to the Samples.csv file. In the terminal window I could see the cf column and its values, but I could not find the cf column in the CSV file when I opened it directly. I manually entered the cf column in the CSV file, but then a different error occurred:
counts=readDGE(s$cf)$counts
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input.
I have one doubt: without a variable name we can't read the CSV file in R, am I right? I ask because the protocol I am following reads the CSV file directly into R and manipulates the columns, but I could not do that.
Any suggestions to rectify the error?
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input.
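The missing cf column described above is consistent with how R treats data frames: a column added with paste exists only in the in-memory copy until the data frame is written back to disk. A minimal base R sketch of that behaviour, assuming a made-up two-sample table (the sample column and the temporary file are hypothetical, not the poster's Samples.csv):

```r
# Hypothetical stand-in for Samples.csv, written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(sample = c("A", "B")), csv_path, row.names = FALSE)

s <- read.csv(csv_path, stringsAsFactors = FALSE)

# paste() adds the cf column to the data frame in memory only.
s$cf <- paste(s$sample, "accepted_hits.count", sep = "/")

# Re-reading the file shows that it has not changed yet.
before <- "cf" %in% names(read.csv(csv_path, stringsAsFactors = FALSE))
print(before)  # FALSE

# Writing the data frame back is what updates the file on disk.
write.csv(s, csv_path, row.names = FALSE)
on_disk <- read.csv(csv_path, stringsAsFactors = FALSE)
after <- "cf" %in% names(on_disk)
print(after)   # TRUE
```

Note that the file never has to be rewritten for readDGE to work: the in-memory s$cf can be passed directly, as long as the paths it holds point to real count files.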
It's not clear to me what you're actually doing. All I can say is to check that there is a cf column in your data frame s, and that s$cf is a character vector that contains paths to count files.
I would also suggest that you find someone local to help you, since it seems like you're new to this. This support site is not meant to be a place to learn R or Bioconductor.
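The two checks above can be sketched in base R before readDGE is called. This is only an illustration: check_count_files is a hypothetical helper, not part of edgeR, and the files it is run on are temporary ones created for the demo. It relies on two facts: s$cf is NULL when the column does not exist, and read.table reports "no lines available in input" when given an empty file. It also flags duplicated paths, on the assumption that each sample should have its own count file.

```r
# Hypothetical helper (not an edgeR function): returns a character vector
# describing problems with a vector of count-file paths; empty if none found.
check_count_files <- function(files) {
  if (is.null(files)) {
    return("column is missing (NULL)")
  }
  problems <- character(0)
  if (!is.character(files)) {
    problems <- c(problems,
                  paste("not a character vector, class is", class(files)[1]))
    files <- as.character(files)
  }
  found <- file.exists(files)
  if (any(!found)) {
    problems <- c(problems, paste("file not found:", files[!found]))
  }
  existing <- files[found]
  is_empty <- file.size(existing) == 0
  if (any(is_empty)) {
    problems <- c(problems, paste("file is empty:", existing[is_empty]))
  }
  dups <- unique(files[duplicated(files)])
  if (length(dups) > 0) {
    problems <- c(problems, paste("path listed more than once:", dups))
  }
  problems
}

# Demo on temporary files: one valid, one empty, one that does not exist.
tmp <- tempfile()
dir.create(tmp)
good <- file.path(tmp, "good.count")
writeLines("gene1\t10", good)
empty <- file.path(tmp, "empty.count")
file.create(empty)
s <- data.frame(cf = c(good, empty, file.path(tmp, "absent.count")),
                stringsAsFactors = FALSE)

print(check_count_files(s$cf))    # reports the absent and the empty file
print(check_count_files(s$nope))  # column that does not exist
```

If the helper returns an empty vector, the paths at least exist and are non-empty; it does not check that the file contents are valid htseq-count output.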