Question

Error when counting reads in genes with summarizeOverlaps (GenomicAlignments package)

alejandro.colaneri ▴ 20

Hello,
I'm following the RNA-seq workflow for differential gene expression
white paper by Michael Love, Simon Anders, and Wolfgang Huber:
http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=10&ved=0CFcQFjAJ&url=http%3A%2F%2Fwww.bioconductor.org%2Fhelp%2Fcourse-materials%2F2014%2FBioC2014%2FRNA-Seq-Analysis-Lab.pdf&ei=_ihmVK3nMoSrNsb9gsAH&usg=AFQjCNH5FkLy2MQwoJCUSMdfb3KrrP45Yw&sig2=iUkasIDbf7SyjBSOl0BKHQ&bvm=bv.79142246,d.eXY&cad=rja

## Use the function summarizeOverlaps to count reads in the gene
library("GenomicAlignments")
se <- summarizeOverlaps(exonsByGene, BamFileList(bamFiles), mode="Union", singleEnd=TRUE, ignore.strand=FALSE, fragments=FALSE);

However, I got this error and have no idea how to fix it:

Error in .summarizeOverlaps_BamFileList(features, reads, mode, ignore.strand = ignore.strand, :
duplicate 'names(reads)' not allowed

Can someone help, please?

All the steps I ran before trying to create the object "se" are below.

### read the table: sampleTable.csv

sampleTable <- read.csv("sampleTable.csv", header=TRUE);

### build the full path to the tophat produced bam files

bamFiles <- file.path(".", sampleTable$dirName, sampleTable$fileName);

### see the created vector with paths

bamFiles

##### Use the BamFile function from Rsamtools to see if these paths are functional

library ("Rsamtools");
seqinfo(BamFile(bamFiles[1]));

#Counting reads in genes

library("GenomicFeatures");

hse <-makeTranscriptDbFromGFF("/proj/seq/data/TAIR10_Ensembl/Annotation/Genes/genes.gtf", format="gtf")
exonsByGene <- exonsBy(hse, by="gene");

## Use the function summarizeOverlaps to count reads in the gene
library("GenomicAlignments")
se <- summarizeOverlaps(exonsByGene, BamFileList(bamFiles), mode="Union", singleEnd=TRUE, ignore.strand=FALSE, fragments=FALSE);

rnaseq summarizeOverlaps deseq2 tophat genomicalignments • 2.4k views

ADD COMMENT • link updated 9.5 years ago by Dan Tenenbaum ★ 8.2k • written 9.5 years ago by alejandro.colaneri ▴ 20


I believe that error is complaining that you have at least two files in your BamFileList with the same name. Is that the case?

ADD REPLY • link 9.5 years ago James W. MacDonald 65k
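A quick way to check for this is sketched below in base R. It assumes that an unnamed vector of paths ends up named by each file's basename when wrapped in a BamFileList; the sample directory names are hypothetical stand-ins, not taken from the original post.

```r
# Hypothetical paths mimicking a one-directory-per-sample layout in which
# every BAM file carries the same file name.
bamFiles <- file.path(".", c("sample1", "sample2", "sample3"), "accepted_hits.bam")

# Assumption: without explicit names, the file basenames are used as names.
defaultNames <- basename(bamFiles)

# Count how many entries repeat an earlier name, and flag any collision.
dupCount <- sum(duplicated(defaultNames))
anyDup <- anyDuplicated(defaultNames) > 0

# Tabulate the names to see which ones collide.
print(table(defaultNames))
```

If `anyDup` is TRUE, the paths need distinct names before they are passed to summarizeOverlaps.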


Actually, when I built the list of paths to my files I did not think about that. But the answer is yes: all the BAM files in my list have the same name, the original accepted_hits.bam name provided by TopHat. Do you think this could be the source of the problem?

ADD REPLY • link 9.5 years ago alejandro.colaneri ▴ 20


I think you can provide distinct names for your bamFiles, e.g,

bamFiles <- file.path(".", sampleTable$dirName, sampleTable$fileName)
names(bamFiles) <- basename(dirname(bamFiles))

Or do something more manual; either way, the distinct names will carry forward.

ADD REPLY • link 9.5 years ago Martin Morgan 25k
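The naming suggestion can be tried out without any BAM files, as in the minimal base-R sketch below. The sampleTable contents are hypothetical; only the column names dirName and fileName come from the question. The final summarizeOverlaps call is left commented out because it needs the Bioconductor packages and real data.

```r
# Hypothetical sample table; the real one is read from sampleTable.csv.
sampleTable <- data.frame(
  dirName = c("sample1", "sample2"),
  fileName = "accepted_hits.bam",
  stringsAsFactors = FALSE
)

# Build the paths exactly as in the question.
bamFiles <- file.path(".", sampleTable$dirName, sampleTable$fileName)

# Name each path after its parent directory, which differs per sample.
names(bamFiles) <- basename(dirname(bamFiles))

# Confirm the names are unique before counting.
namesAreUnique <- anyDuplicated(names(bamFiles)) == 0L
print(bamFiles)

# With Bioconductor installed and real files in place, the named vector
# would then be used as before:
# se <- summarizeOverlaps(exonsByGene, BamFileList(bamFiles), mode="Union",
#                         singleEnd=TRUE, ignore.strand=FALSE, fragments=FALSE)
```

This relies on each sample living in its own directory; if several BAM files share a directory, another unique column of the sample table would be a better source of names.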