Hi,
I have a question regarding the quantification summary file that featureCounts produces. My paired-end RNA-seq data includes some read pairs in which only one mate is mapped and the other is unmapped. I generally count read pairs rather than individual reads. However, for pairs with only one mate mapped, the two mates are counted separately in the featureCounts summary. As an example, I used the following input read pair:
SRR22319547.8346125 89 chr1 967405 255 100M * 0 0 GGTCTTCACAGGGTAGATCCCAGCCCCTTTCAGATGTGTCTGGTGCTGGGATGAGGGAACAGGACCAGGAACCTGGGCTTCAGGGCAGACAGGAACCCCC FFFF:FFFFFFFF,:F::FFF:FFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:98 nM:i:0 MD:Z:100 NM:i:0
SRR22319547.8346125 165 * 0 0 * chr1 967405 0 CACCCCCCATCCCCCCCCCCCCCCCCCCCC FFFFFFF,,,,FF:,FFFFFFFF::FFFFF NH:i:0 HI:i:0 AS:i:98 nM:i:0 uT:A:4
and ran featureCounts v2.0.6 with the following parameters:
featureCounts -a gencode.v31.primary_assembly.annotation.gtf -t transcript -p -s2 --countReadPairs -o SAMPLE.readcounts.tsv SAMPLE.bam
The resulting summary file reports:
Status SAMPLE.bam
Assigned 1
Unassigned_Unmapped 0
Unassigned_Read_Type 0
Unassigned_Singleton 0
Unassigned_MappingQuality 0
Unassigned_Chimera 0
Unassigned_FragmentLength 0
Unassigned_Duplicate 0
Unassigned_MultiMapping 0
Unassigned_Secondary 0
Unassigned_NonSplit 0
Unassigned_NoFeatures 1
Unassigned_Overlapping_Length 0
Unassigned_Ambiguity 0
i.e. the mates were counted as individual reads rather than as a pair. Summing all counts in the summary gives a total higher than the number of pairs in the sample (here: 2 instead of 1). Is this intended behavior, and do you think it would be possible to report counts originating from individual reads separately? As currently reported, the summary does not allow the user to distinguish assigned pairs from assigned single reads, or unassigned no-feature pairs from unassigned no-feature single reads.
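To make the arithmetic concrete, here is a minimal Python sketch that sums the count column of the summary shown above. The function name summary_total is hypothetical and only for illustration, and the summary text is pasted inline rather than read from SAMPLE.readcounts.tsv.summary.

```python
# Sketch: sum every category in a featureCounts summary for one sample.
# The text below is the summary from this post; summary_total is a
# hypothetical helper name, not part of featureCounts.
SUMMARY = """\
Status\tSAMPLE.bam
Assigned\t1
Unassigned_Unmapped\t0
Unassigned_Read_Type\t0
Unassigned_Singleton\t0
Unassigned_MappingQuality\t0
Unassigned_Chimera\t0
Unassigned_FragmentLength\t0
Unassigned_Duplicate\t0
Unassigned_MultiMapping\t0
Unassigned_Secondary\t0
Unassigned_NonSplit\t0
Unassigned_NoFeatures\t1
Unassigned_Overlapping_Length\t0
Unassigned_Ambiguity\t0
"""


def summary_total(text):
    """Return the sum of all category counts (first sample column)."""
    total = 0
    # Skip the header line ("Status <bam name>").
    for line in text.strip().splitlines()[1:]:
        status, count = line.split()[:2]
        total += int(count)
    return total


print(summary_total(SUMMARY))  # 2, although the input holds a single pair
```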
Thanks a lot and best,
Anke.
Thanks a lot for the fast reply and for pointing me to the -B argument. Setting -B does indeed give a clear count of pairs. However, when -B is not set and singletons are included in the counting, is there still a way to see them as separate categories in the summary (e.g. a category Unassigned_Singleton_Unmapped that counts unmapped mates instead of adding them to the category Unassigned_NoFeatures)?
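In the meantime, a possible workaround is to tally singletons directly from the SAM flags, independently of featureCounts. The following is only a rough Python sketch: it assumes SAM-formatted text lines (for example as printed by samtools view), ignores secondary and supplementary alignments, and the helper name tally_pairs is hypothetical. The example records are shortened versions of the pair above, with sequence and quality replaced by "*".

```python
from collections import Counter

# Standard SAM flag bits.
FLAG_PAIRED = 0x1
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def tally_pairs(sam_lines):
    """Classify each read pair by how many of its mates are mapped.

    Hypothetical helper: returns a Counter with the keys
    'both_mapped', 'singleton' and 'both_unmapped'.
    """
    mapped_mates = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.split()
        name, flag = fields[0], int(fields[1])
        if not flag & FLAG_PAIRED:
            continue  # only paired-end records are considered
        if flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
            continue  # count each mate once, via its primary record
        mapped_mates.setdefault(name, 0)
        if not flag & FLAG_UNMAPPED:
            mapped_mates[name] += 1
    labels = {0: "both_unmapped", 1: "singleton", 2: "both_mapped"}
    return Counter(labels[min(n, 2)] for n in mapped_mates.values())


# Shortened versions of the two records from the question.
EXAMPLE = [
    "SRR22319547.8346125\t89\tchr1\t967405\t255\t100M\t*\t0\t0\t*\t*",
    "SRR22319547.8346125\t165\t*\t0\t0\t*\tchr1\t967405\t0\t*\t*",
]

print(tally_pairs(EXAMPLE))  # one pair, classified as a singleton
```

This only reports how many pairs are singletons. It does not tell which summary category featureCounts placed each mate in.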