Question

Explanation of featureCounts in-built annotation?

Jack • 0

Hi all,

I first use featureCounts to get the count summary and the annotation. The code is below:

test<-featureCounts(files="test.bam", annot.inbuilt="mm10", annot.ext=NULL, isGTFAnnotationFile=FALSE, GTF.featureType="exon", GTF.attrType="gene_id", chrAliases=NULL, useMetaFeatures=TRUE,  strandSpecific=1, isPairedEnd=TRUE, requireBothEndsMapped=FALSE, checkFragLength=FALSE, minFragLength=0,maxFragLength=600, countChimericFragments=TRUE, autosort=TRUE,verbose=FALSE)

write.table(x=data.frame(test$annotation[,c("GeneID","Chr","Start","End","Strand","Length")],test$counts,stringsAsFactors=FALSE),file="/home/test-counts.txt", quote=FALSE, sep="\t", row.names = FALSE)

The results are like this:

In the "Chr", "Start", "End" and "Strand" annotation columns there are several values for each gene. Are they the locations of the different exons of the same gene?

How can I get the location of the gene itself, so that each GeneID has only one value in "Chr", "Start", "End" and "Strand", rather than the locations of its exons?

Thank you for your help!

rnaseq annotation featurecounts • 2.5k views

Updated 6.5 years ago by Wei Shi • written 6.5 years ago by Jack

score 3 · Accepted Answer · 2017-10-29

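Yes, the annotations are for each exon, not for genes. You can work out the transcriptional start and end positions of a gene from the coordinates of the exons belonging to that gene together with the strand information: the gene spans from the smallest exon start to the largest exon end, and the strand tells you which of the two is the transcriptional start.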
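As a rough sketch of that first approach: with useMetaFeatures=TRUE, the annotation component of the featureCounts result holds the exon values of each gene joined by semicolons. The code below collapses them to one range per gene. The toy data frame stands in for test$annotation and its values are made up for illustration; gene_ranges and split_field are hypothetical helper names, not Rsubread functions. It also assumes every gene sits on a single chromosome and strand, so only the first Chr and Strand value is kept.

```r
# Toy stand-in for test$annotation (values are made up for illustration).
annotation <- data.frame(
  GeneID = c("497097", "19888"),
  Chr    = c("chr1;chr1;chr1", "chr1;chr1"),
  Start  = c("3214482;3421702;3670552", "4290846;4343507"),
  End    = c("3216968;3421901;3671498", "4293012;4350091"),
  Strand = c("-;-;-", "+;+"),
  stringsAsFactors = FALSE
)

# Split a semicolon-joined column into a list with one vector per gene.
split_field <- function(x) strsplit(as.character(x), ";", fixed = TRUE)

# Hypothetical helper: collapse exon-level fields to one range per gene.
# Assumes each gene lies on a single chromosome and strand.
gene_ranges <- function(annotation) {
  starts <- lapply(split_field(annotation$Start), as.integer)
  ends   <- lapply(split_field(annotation$End), as.integer)
  first  <- function(v) v[1]
  genes <- data.frame(
    GeneID = annotation$GeneID,
    Chr    = vapply(split_field(annotation$Chr), first, character(1)),
    Start  = vapply(starts, min, integer(1)),  # smallest exon start
    End    = vapply(ends, max, integer(1)),    # largest exon end
    Strand = vapply(split_field(annotation$Strand), first, character(1)),
    stringsAsFactors = FALSE
  )
  # Transcriptional start: Start on the "+" strand, End on the "-" strand.
  genes$TSS <- ifelse(genes$Strand == "+", genes$Start, genes$End)
  genes
}

genes <- gene_ranges(annotation)
print(genes)
```

Because the rows stay in the same order as the input, the result of gene_ranges(test$annotation) could be placed next to test$counts in the write.table call in place of the exon-level columns.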

An easier way might be to use the getInBuiltAnnotation function in Rsubread. This function returns a data frame in which each row is an exon and columns are annotation data.
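A minimal sketch of how that exon-level data frame might be reduced to one row per gene, assuming it has the columns GeneID, Chr, Start, End and Strand. The toy exons data frame below uses made-up values so the snippet runs without Rsubread installed; with real data it would be replaced by the commented-out getInBuiltAnnotation call.

```r
# With Rsubread installed, the real exon table would come from:
# exons <- Rsubread::getInBuiltAnnotation("mm10")
# Toy stand-in with the same column layout (values are made up).
exons <- data.frame(
  GeneID = c(497097, 497097, 497097, 19888, 19888),
  Chr    = c("chr1", "chr1", "chr1", "chr1", "chr1"),
  Start  = c(3214482, 3421702, 3670552, 4290846, 4343507),
  End    = c(3216968, 3421901, 3671498, 4293012, 4350091),
  Strand = c("-", "-", "-", "+", "+"),
  stringsAsFactors = FALSE
)

# Smallest exon start and largest exon end within each gene.
starts <- aggregate(Start ~ GeneID + Chr + Strand, data = exons, FUN = min)
ends   <- aggregate(End ~ GeneID + Chr + Strand, data = exons, FUN = max)

# Join the two summaries back into one row per gene.
genes <- merge(starts, ends, by = c("GeneID", "Chr", "Strand"))
genes <- genes[, c("GeneID", "Chr", "Start", "End", "Strand")]
print(genes)
```

Grouping by Chr and Strand as well as GeneID means a gene annotated on more than one chromosome or strand would come out as several rows, which makes such cases easy to spot. Note that merge sorts by the key columns, so match on GeneID before combining with the counts matrix.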