Clarification on counting in Rsubread (featureCounts)

Luca • 0

@lucapiacentini-9597

Italy

Dear Subread developers,

I am analyzing RNA sequencing data from ribo-depleted RNA samples generated using 150 bp paired-end, stranded sequencing. Reads were aligned to the human reference genome GRCh38 (Ensembl release 115) using STAR, with more than 95 percent of reads successfully aligned. I then quantified gene expression using featureCounts with the corresponding Ensembl GTF (release 115).

When running featureCounts with -t exon -g gene_id, approximately 20 to 30 percent of the aligned reads are assigned, which is expected given that this setting effectively quantifies mature (exonic) RNA only. In contrast, when using -t gene -g gene_id, the proportion of assigned reads increases to about 70 to 85 percent, consistent with aggregation across the full gene body, including intronic and other non-exonic regions present in ribo-depleted libraries.
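For concreteness, the two runs described above might look like the following sketch. It only builds and prints the command lines; it does not execute featureCounts. The file names are hypothetical placeholders, and the `-p --countReadPairs` pair is an assumption about how paired-end fragments were counted (the strandedness option `-s` is left out because the right value depends on the library protocol).

```python
import shlex


def featurecounts_cmd(feature_type, gtf, out, bams):
    """Build a featureCounts command line for one feature type (not executed here)."""
    return [
        "featureCounts",
        "-p", "--countReadPairs",  # assumption: count fragments for paired-end data
        "-t", feature_type,        # GTF feature type to count over: "exon" or "gene"
        "-g", "gene_id",           # attribute used to group features into genes
        "-a", gtf,                 # annotation file
        "-o", out,                 # output count table
        *bams,
    ]


# Hypothetical file names, for illustration only.
gtf = "Homo_sapiens.GRCh38.115.gtf"
bams = ["sample1.bam", "sample2.bam"]

exon_cmd = featurecounts_cmd("exon", gtf, "counts_exon.txt", bams)
gene_cmd = featurecounts_cmd("gene", gtf, "counts_gene.txt", bams)

print(shlex.join(exon_cmd))
print(shlex.join(gene_cmd))
```

The only differences between the two command lines are the `-t` value and the output file name.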

However, I observe an unexpected behavior: for some genes that have non-zero counts across all samples when using -t exon, the corresponding counts are zero when using -t gene. Intuitively, I would expect these genes to retain at least the same counts (or even higher counts) when switching from exon-level to gene-level features, not to drop to zero.
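To make the observation reproducible, the affected genes can be listed by comparing the two count tables. The sketch below assumes the standard tab-separated featureCounts output (a leading `#` comment line, then a header whose sample columns follow `Length`). The two embedded tables are made-up toy data, not my real counts, and the helper names `read_counts` and `lost_genes` are hypothetical.

```python
import csv
import io


def read_counts(handle):
    """Parse a featureCounts table into {gene_id: [count per sample]}."""
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t")
    header = next(reader)
    first_sample = header.index("Length") + 1  # sample columns follow "Length"
    return {
        row[0]: [int(x) for x in row[first_sample:]]
        for row in reader
        if row  # skip blank lines
    }


def lost_genes(exon_counts, gene_counts):
    """Genes non-zero in every sample with -t exon but zero in every sample with -t gene."""
    return sorted(
        gene
        for gene, counts in exon_counts.items()
        if all(c > 0 for c in counts)
        and gene in gene_counts
        and all(c == 0 for c in gene_counts[gene])
    )


# Toy tables for illustration only (not real data).
HEADER = "Geneid\tChr\tStart\tEnd\tStrand\tLength\ts1.bam\ts2.bam\n"
EXON_TABLE = (
    "# Program:featureCounts\n" + HEADER
    + "GENE_A\tchr1\t100\t200\t+\t101\t12\t9\n"
    + "GENE_B\tchr1\t500\t900\t-\t401\t40\t35\n"
)
GENE_TABLE = (
    "# Program:featureCounts\n" + HEADER
    + "GENE_A\tchr1\t100\t5000\t+\t4901\t0\t0\n"
    + "GENE_B\tchr1\t500\t9000\t-\t8501\t310\t280\n"
)

exon_counts = read_counts(io.StringIO(EXON_TABLE))
gene_counts = read_counts(io.StringIO(GENE_TABLE))
print(lost_genes(exon_counts, gene_counts))
```

With real output, `io.StringIO(...)` would be replaced by `open("counts_exon.txt")` and `open("counts_gene.txt")` (hypothetical file names).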

Is there a plausible explanation for this behavior?

Thanks in advance for your help.

Best regards.

featureCounts Rsubread • 132 views

Updated 7 days ago by Kevin Blighe • written 25 days ago by Luca