Hi all - Hoping I can get some guidance on an issue I'm having with the featureCounts function in the RSubread package.
I've successfully aligned 150bp paired-end reads to the mouse genome and am trying to summarize the reads to features.
My problem is that if I summarize with strandSpecific=1, I get the following output showing less than 3% of fragments assigned to features:
|| Level : meta-feature level ||
|| Paired-end : yes ||
|| Strand specific : yes ||
|| Multimapping reads : not counted ||
|| Multi-overlapping reads : not counted ||
|| ||
|| Chimeric reads : counted ||
|| Both ends mapped : required ||
|| ||
\\===================== http://subread.sourceforge.net/ ======================//
//================================= Running ==================================\\
|| ||
|| Load annotation file /Users/CHMI/Desktop/mergedFastq/Mus_musculus.GRCm ... ||
|| Features : 665154 ||
|| Meta-features : 43629 ||
|| Chromosomes : 61 ||
|| ||
|| Process BAM file alignmentResultsPE_sample1.BAM... ||
|| Paired-end reads are included. ||
|| Assign fragments (read pairs) to features... ||
|| Total fragments : 28405354 ||
|| Successfully assigned fragments : 780157 (2.7%) ||
|| Running time : 1.42 minutes ||
In contrast, if I set strandSpecific=0 (unstranded), then this percentage goes up to >50% -- output below. This is closer to what I expect, given that these RNAseq libraries were prepared from total transcriptome).
|| ||
|| Level : meta-feature level ||
|| Paired-end : yes ||
|| Strand specific : no ||
|| Multimapping reads : not counted ||
|| Multi-overlapping reads : not counted ||
|| ||
|| Chimeric reads : counted ||
|| Both ends mapped : required ||
|| ||
\\===================== http://subread.sourceforge.net/ ======================//
//================================= Running ==================================\\
|| ||
|| Load annotation file Mus_musculus.GRCm38.79.gtf ... ||
|| Features : 665154 ||
|| Meta-features : 43629 ||
|| Chromosomes : 61 ||
|| ||
|| Process BAM file alignmentResultsPE_sample1.BAM... ||
|| Paired-end reads are included. ||
|| Assign fragments (read pairs) to features... ||
|| Total fragments : 28405354 ||
|| Successfully assigned fragments : 16337981 (57.5%) ||
|| Running time : 0.96 minutes ||
Hey Ryan - I was wondering how did you tell from the information provided by beiting that the dataset was probably strand-specific with reserver strandedness? I have gone through a very similar situation and after landing in this post, I checked that the featureCounts output for unstranded and reverse strand specific were almost identical (featureCounts v1.4.6-p6). I was just curious to know if there is anything in the featureCount logs combined with a more in-depth knowledge of the biology behind sequencing that allowed you to tell. Thanks!
Like I said, I always process every sample that I analyze using all 3 strandedness modes, so know what a typical good-quality sample looks like in all 3 counting modes. beiting's output closely matches what I've seen in the past for wrong-strand counting and unstranded counting.