Question

Clarification on counting in Rsubread (featureCounts)

Luca • 0

@lucapiacentini-9597

Italy

Dear Subread developers,

I am analyzing RNA sequencing data from ribo-depleted RNA samples generated using 150 bp paired-end, stranded sequencing. Reads were aligned to the human reference genome GRCh38 (Ensembl release 115) using STAR, with more than 95 percent of reads successfully aligned. I then quantified gene expression using featureCounts with the corresponding Ensembl GTF (release 115).

When running featureCounts with -t exon -g gene_id, approximately 20 to 30 percent of the aligned reads are assigned, which is expected given that this setting effectively quantifies mature (exonic) RNA only. In contrast, when using -t gene -g gene_id, the proportion of assigned reads increases to about 70 to 85 percent, consistent with aggregation across the full gene body, including the intronic and other non-exonic regions that are well represented in ribo-depleted libraries.

However, I observe unexpected behavior: some genes that have non-zero counts across all samples with -t exon have zero counts with -t gene. Intuitively, I would expect these genes to retain at least the same counts (or even higher counts) when switching from exon-level to gene-level features, not to drop to zero.

Is there a plausible explanation for this behavior?

Thanks in advance for your help.

Best regards.

featureCounts Rsubread • 173 views

updated 3 hours ago by Gordon Smyth 53k • written 28 days ago by Luca • 0

Have you told featureCounts to do strand-specific counting?

As pointed out by Frances Turner, you can lose reads if they overlap the gene bodies of more than one gene. Such overlaps are increased if you consider full gene bodies, but are greatly reduced if strand is taken into account. Most overlapping genes are on opposite strands.
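To make the overlap argument concrete, here is a toy Python sketch of the assignment rule. It is not featureCounts itself: the gene names and coordinates are invented, and it ignores read pairs, minimum-overlap settings and multi-mapping. It assumes only that a read overlapping more than one gene is left unassigned, and that the read strand has already been oriented to the transcript strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    gene_id: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def assign(read_start, read_end, read_strand, features, stranded=False):
    """Toy assignment rule: return a gene ID or an 'Unassigned_*' status."""
    hits = {
        f.gene_id
        for f in features
        if f.start <= read_end and read_start <= f.end
        and (not stranded or f.strand == read_strand)
    }
    if not hits:
        return "Unassigned_NoFeatures"
    if len(hits) > 1:
        return "Unassigned_Ambiguity"
    return next(iter(hits))


# Hypothetical annotation: geneA (+ strand) sits inside an intron of geneB (- strand).
exons = [
    Feature("geneA", 3000, 3300, "+"),
    Feature("geneA", 3700, 4000, "+"),
    Feature("geneB", 1, 500, "-"),
    Feature("geneB", 9000, 10000, "-"),
]
genes = [
    Feature("geneA", 3000, 4000, "+"),
    Feature("geneB", 1, 10000, "-"),
]

# One read falling in the first exon of geneA.
print(assign(3100, 3250, "+", exons))                 # geneA
print(assign(3100, 3250, "+", genes))                 # Unassigned_Ambiguity
print(assign(3100, 3250, "+", genes, stranded=True))  # geneA
```

In this toy layout every geneA read also lies inside the gene body of geneB, so geneA drops to zero with gene-level features unless strand is used to separate the two genes.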

3 hours ago • Gordon Smyth 53k

score 1 · Answer 1 · 2026-01-12

Frances Turner ▴ 10

@6bce1540

United Kingdom

You should look at the summary files produced by featureCounts. These give a breakdown of what happened to the reads that were not assigned. You may find that the number of reads in the 'ambiguous' category (reported as Unassigned_Ambiguity) has increased. This could happen if the introns (but not the exons) of the genes in question overlap with another gene.
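As a rough sketch of that comparison, the Python snippet below parses two summary tables and reports the fraction of reads in the Unassigned_Ambiguity row for each run. The tab-separated layout (a Status column followed by one column per BAM file) matches the .summary files featureCounts writes, but the sample name and all counts here are invented for illustration, and the helper names are hypothetical.

```python
import csv
import io


def read_summary(text):
    """Parse featureCounts summary text into {sample: {status: count}}."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    samples = rows[0][1:]
    table = {sample: {} for sample in samples}
    for row in rows[1:]:
        for sample, value in zip(samples, row[1:]):
            table[sample][row[0]] = int(value)
    return table


def ambiguity_fraction(counts):
    """Fraction of all reads that ended up in Unassigned_Ambiguity."""
    total = sum(counts.values())
    return counts.get("Unassigned_Ambiguity", 0) / total if total else 0.0


# Invented numbers, standing in for the summaries of the two runs.
EXON_SUMMARY = (
    "Status\tsample1.bam\n"
    "Assigned\t250\n"
    "Unassigned_Ambiguity\t10\n"
    "Unassigned_NoFeatures\t740\n"
)
GENE_SUMMARY = (
    "Status\tsample1.bam\n"
    "Assigned\t780\n"
    "Unassigned_Ambiguity\t150\n"
    "Unassigned_NoFeatures\t70\n"
)

for label, text in [("-t exon", EXON_SUMMARY), ("-t gene", GENE_SUMMARY)]:
    for sample, counts in read_summary(text).items():
        print(label, sample, f"{ambiguity_fraction(counts):.1%}")
```

With real data, you would read the two .summary files from disk instead of the inline strings and check whether the ambiguity fraction rises when switching from exon-level to gene-level features.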

12 hours ago • Frances Turner ▴ 10