Off topic:Low number of assigned reads using featureCounts / Strandedness issue
1
0
Entering edit mode
@martinweihrauch-13664
Last seen 6.0 years ago

I've conducted a RNA-Seq experiment generating fastq-files. The sequencing libraries were prepared with a Illumina TruSeq stranded mRNA kit using dUTPs for strandedness and producing single-end reads.

I've mapped the fastq-files to a genome (using STAR and the latest GENCODE release GRCm38.p6 genome fasta files, and the annotation file "Comprehensive gene annotation" M17 CHR).

Now I'm doing the read-counting in R using featureCounts() from Rsubread package (all latest versions; bioconductor 3.6).

 

When I count the reads with strandSpecific = 1 inside featureCounts() I only get about 5% of the reads successfully aligned, compared to 89.2% if I set it to count unstranded (strandSpecific = 0) and 92.6% if I set strandedness as "reverse" (strandSpecific = 2).

I was told that strandSpecific = 2 ("reverse") should be used for stranded paired-end reads only... and I got single-end reads here.

Here's my code:

# gtf.file <- file.path("~/Annotation_files/GENCODE/gencode.vM17.comprehensive.annotation.gtf")

fc <- featureCounts(bam.files.sorted[1], annot.ext = gtf.file,  isGTFAnnotationFile = TRUE, strandSpecific = 1, isPairedEnd = FALSE)

  Single-end reads are included.                                          ||
||    Assign reads to features...                                             ||
||    Total reads : 35684655                                                  ||
||    Successfully assigned reads : 1780810 (5.0%)                            ||
||    Running time : 1.39 minutes

 

 

However, if I set strandSpecific to = 2, I get:

 

Single-end reads are included.                                          ||
||    Assign reads to features...                                             ||
||    Total reads : 35684655                                                  ||
||    Successfully assigned reads : 33041494 (92.6%)                          ||
||    Running time : 1.42 minutes  

 

What might be going on here? I'm very grateful for any ideas or solutions to this problem.

 

Edit: I´ve tried and took a look at my .bam files in IGV:

https://picload.org/view/doragdpa/capture2.jpg.html

 

It looks like the reads always go in the opposite direction of the gene, suggesting reverse-strandedness. 

With this I can be sure that I have to count the reads as "reverse".

rsubread featurecounts strandedness rnaseq • 1.7k views
ADD COMMENT
This thread is not open. No new answers may be added
Traffic: 754 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6