Question: Error when counting reads in genes with summarizeOverlaps (GenomicAlignments package)
5.1 years ago, alejandro.colaneri (United States) wrote:

Hello,
I'm following the RNA-seq workflow for differential gene expression, a white paper by Michael Love, Simon Anders, and Wolfgang Huber:
http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=10&ved=0CFcQFjAJ&url=http%3A%2F%2Fwww.bioconductor.org%2Fhelp%2Fcourse-materials%2F2014%2FBioC2014%2FRNA-Seq-Analysis-Lab.pdf&ei=_ihmVK3nMoSrNsb9gsAH&usg=AFQjCNH5FkLy2MQwoJCUSMdfb3KrrP45Yw&sig2=iUkasIDbf7SyjBSOl0BKHQ&bvm=bv.79142246,d.eXY&cad=rja

 

## Use the function summarizeOverlaps to count reads in the gene
library("GenomicAlignments")
se <- summarizeOverlaps(exonsByGene, BamFileList(bamFiles), mode="Union", singleEnd=TRUE, ignore.strand=FALSE, fragments=FALSE);

 

However, I got this error and I have no idea how to fix it:

Error in .summarizeOverlaps_BamFileList(features, reads, mode, ignore.strand = ignore.strand, :
duplicate 'names(reads)' not allowed

Can someone help, please?

All the steps I ran before trying to create the object "se" are below:

### read the table: sampleTable.csv

sampleTable <- read.csv("sampleTable.csv", header=TRUE);

### build the full path to the tophat produced bam files

bamFiles <- file.path(".", sampleTable$dirName, sampleTable$fileName);

### see the created vector with paths

bamFiles

##### Use the BamFile function from Rsamtools to see if these paths are functional

library("Rsamtools");
seqinfo(BamFile(bamFiles[1]));

#Counting reads in genes

library("GenomicFeatures");

hse <- makeTranscriptDbFromGFF("/proj/seq/data/TAIR10_Ensembl/Annotation/Genes/genes.gtf", format="gtf")
exonsByGene <- exonsBy(hse, by="gene");

## Use the function summarizeOverlaps to count reads in the gene
library("GenomicAlignments")
se <- summarizeOverlaps(exonsByGene, BamFileList(bamFiles), mode="Union", singleEnd=TRUE, ignore.strand=FALSE, fragments=FALSE);

 

 

modified 5.1 years ago by Dan Tenenbaum • written 5.1 years ago by alejandro.colaneri

I believe that error is complaining that you have at least two files in your BamFileList with the same name. Is that the case?

written 5.1 years ago by James W. MacDonald

Actually, when I built the list of paths to my files I did not pay attention to that. But the answer is YES: all the BAM files in my BamFileList have the same name, accepted_hits.bam, the original name produced by TopHat. Do you think this could be the source of the problem?

written 5.1 years ago by alejandro.colaneri
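The situation described in this reply can be reproduced in base R without any BAM files. The sketch below is only an illustration: the sample directory names are made up, and it assumes that unnamed paths end up labelled by their base file name once they are wrapped in a BamFileList, which would explain the "duplicate 'names(reads)'" error.

```r
# Hypothetical layout mirroring the thread: one TopHat directory per sample,
# each holding a file called accepted_hits.bam (directory names are made up).
sampleDirs <- c("sampleA", "sampleB", "sampleC")
bamFiles <- file.path(".", sampleDirs, "accepted_hits.bam")

# The file names alone cannot tell the samples apart.
fileNames <- basename(bamFiles)
print(fileNames)

# anyDuplicated() returns 0 when every element is unique, and the index
# of the first repeated element otherwise.
hasDuplicates <- anyDuplicated(fileNames) > 0
print(hasDuplicates)
```

Running the same check on the real `bamFiles` vector is a quick way to confirm whether duplicated file names are involved before calling summarizeOverlaps.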

I think you can provide distinct names for your bamFiles, e.g.,

bamFiles <- file.path(".", sampleTable$dirName, sampleTable$fileName)
names(bamFiles) <- basename(dirname(bamFiles))

Or do something more manual; either way, the distinct names will carry forward.

written 5.1 years ago by Martin Morgan
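As a minimal sketch of the naming suggestion, the base R code below names each path after its parent directory and checks that the names are unique before they would be handed to BamFileList. The sample directory names are hypothetical, and the `make.unique()` fallback is just one possible reading of "something more manual", not something proposed in the thread.

```r
# Hypothetical sample directories; in the thread these come from
# sampleTable$dirName and sampleTable$fileName.
sampleDirs <- c("sampleA", "sampleB", "sampleC")
bamFiles <- file.path(".", sampleDirs, "accepted_hits.bam")

# Name each path after the directory that contains it.
names(bamFiles) <- basename(dirname(bamFiles))
print(names(bamFiles))

# Fail early if the names are still not unique.
stopifnot(anyDuplicated(names(bamFiles)) == 0)

# One manual alternative (an assumption, not from the thread):
# append a numeric suffix to repeated file names.
manualNames <- make.unique(basename(bamFiles))
print(manualNames)

# The named vector would then be used as in the original call:
# se <- summarizeOverlaps(exonsByGene, BamFileList(bamFiles), mode="Union",
#                         singleEnd=TRUE, ignore.strand=FALSE, fragments=FALSE)
```

Directory-based names are usually more readable than numeric suffixes, since they identify the sample each file came from.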