Hello,
I'm following the RNA-seq workflow for differential gene expression
white paper by Michael Love, Simon Anders and Wolfgang Huber:
http://www.bioconductor.org/help/course-materials/2014/BioC2014/RNA-Seq-Analysis-Lab.pdf
## Use the function summarizeOverlaps to count reads in the gene
library("GenomicAlignments")
se <- summarizeOverlaps(exonsByGene, BamFileList(bamFiles), mode="Union", singleEnd=TRUE, ignore.strand=FALSE, fragments=FALSE)

However, I got this error and I have no idea how to fix it:
Error in .summarizeOverlaps_BamFileList(features, reads, mode, ignore.strand = ignore.strand, :
duplicate 'names(reads)' not allowed
Can someone help please!!
All the steps I ran before trying to create the object "se" are below:

### read the table: sampleTable.csv
sampleTable <- read.csv("sampleTable.csv", header=TRUE)

### build the full path to the tophat produced bam files
bamFiles <- file.path(".", sampleTable$dirName, sampleTable$fileName)

### see the created vector with paths
bamFiles

##### Use the BamFile function from Rsamtools to see if these paths are functional
library("Rsamtools")
seqinfo(BamFile(bamFiles[1]))

# Counting reads in genes
library("GenomicFeatures")
hse <- makeTranscriptDbFromGFF("/proj/seq/data/TAIR10_Ensembl/Annotation/Genes/genes.gtf", format="gtf")
exonsByGene <- exonsBy(hse, by="gene")

## Use the function summarizeOverlaps to count reads in the gene
library("GenomicAlignments")
se <- summarizeOverlaps(exonsByGene, BamFileList(bamFiles), mode="Union", singleEnd=TRUE, ignore.strand=FALSE, fragments=FALSE)
I believe that error is complaining that you have at least two files in your BamFileList with the same name. Is that the case?
Actually, when I built the list of paths to my files I did not pay attention to that. But the answer is YES: all the BAM files in my BamFileList have the same name, the original accepted_hits.bam name provided by tophat. Do you think this could be the source of the problem?
I think you can provide distinct names for your bamFiles, e.g,
Or do something more manual; the distinct names will carry forward.
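A minimal sketch of the suggestion above, assuming each sample lives in its own directory so that the directory name can serve as a unique label. The `sampleTable` contents (`sampleA`, `sampleB`) are made up for illustration, and the sketch assumes that `BamFileList()` keeps the names of a named character vector; it is not necessarily the exact code the reply had in mind.

```r
# Hypothetical sample table: the directory names are made up and stand in
# for the sampleTable.csv from the question.
sampleTable <- data.frame(
  dirName  = c("sampleA", "sampleB"),
  fileName = c("accepted_hits.bam", "accepted_hits.bam"),
  stringsAsFactors = FALSE
)

bamFiles <- file.path(".", sampleTable$dirName, sampleTable$fileName)

# Every file is called accepted_hits.bam, so names derived from the file
# names alone would collide. Label each path with its sample directory instead.
names(bamFiles) <- basename(dirname(bamFiles))

# Fail early if the chosen names are still not unique.
stopifnot(anyDuplicated(names(bamFiles)) == 0)

# Assumption: BamFileList() carries the names of the character vector forward.
# Only runs if Rsamtools is installed.
if (requireNamespace("Rsamtools", quietly = TRUE)) {
  bfl <- Rsamtools::BamFileList(bamFiles)
  print(names(bfl))
}
```

The named list can then be passed to summarizeOverlaps in place of `BamFileList(bamFiles)`. A more manual alternative is to assign the names by hand, e.g. `names(bamFiles) <- c("ctrl1", "ctrl2")` (hypothetical labels), as long as they are all distinct.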