Question: The optimal minMQS parameter in featureCounts for RNA-Seq quantification
1
gravatar for Gary
3 months ago by
Gary10
Gary10 wrote:

I have asked this question on Biostars (https://www.biostars.org/p/361800/), and have not got any response. Many thanks for any suggestions and helps.

We have six RNA-Seq data, including three wild type and three knock-out mice. I use STAR 2.5.3a (built-in Partek Flow) for the alignment and featureCounts (Rsubread_1.30.9) for the quantification. The command I run featureCoutns is below.

files <- c("Chuong753.bam", "Chuong754.bam", "Chuong755.bam",
  "Chuong756.bam", "Chuong757.bam", "Chuong758.bam")
rc <- featureCounts(files, annot.ext = "mm10ncbiRefSeqCurated.gtf",
  isGTFAnnotationFile = TRUE, GTF.featureType = "exon",
  GTF.attrType = "gene_id", minMQS = 10, strandSpecific = 0,
  nthreads = 6, verbose = TRUE)

I found that there are a lot of Unassigned_MappingQuality reads (about 26.9% of total alignments, the detail below). Should I set minMQS=3 or 0 to increase the number of Assigned reads? Many thanks.

rc$stat
                          Status Chuong753 Chuong754 Chuong755 Chuong756 Chuong757 Chuong758
1                       Assigned  28552790  19795064  26274194  26601820  21264775  21703604
2            Unassigned_Unmapped         0         0         0         0         0         0
3      Unassigned_MappingQuality  10233734   6718392  10369784   9299763   9249801  12905108
4             Unassigned_Chimera         0         0         0         0         0         0
5      Unassigned_FragmentLength         0         0         0         0         0         0
6           Unassigned_Duplicate         0         0         0         0         0         0
7        Unassigned_MultiMapping         0         0         0         0         0         0
8           Unassigned_Secondary         0         0         0         0         0         0
9         Unassigned_Nonjunction         0         0         0         0         0         0
10         Unassigned_NoFeatures   2596367   1732967   2577948   1882419   2235492   2757615
11 Unassigned_Overlapping_Length         0         0         0         0         0         0
12          Unassigned_Ambiguity    237389    166711    228953    216951    178357    218899

In addition, I am confused about the mapping quality score. The range of mapping quality score seems:

  1. 0 to 255 for STAR (http://labshare.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STAR.posix/doc/STARmanual.pdf)

  2. 0 to 40 for Rsubread v1.32.0 (https://bioconductor.org/packages/release/bioc/vignettes/Rsubread/inst/doc/SubreadUsersGuide.pdf)

  3. 0 to read length (i.e. 75 in my case) for Rsubread v1.20.1 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.140&rep=rep1&type=pdf)

  4. 0 to 200 for Rsubread v1.11.10 (http://citeseerx.ist.psu.edu/viewdochelpnload?doi=10.1.1.366.8312&rep=rep1&type=pdf)

Could you hlep me? Thanks a lot.

ADD COMMENTlink modified 3 months ago by Gordon Smyth37k • written 3 months ago by Gary10
Answer: The optimal minMQS parameter in featureCounts for RNA-Seq quantification
2
gravatar for Wei Shi
3 months ago by
Wei Shi3.1k
Australia
Wei Shi3.1k wrote:

Please post your questions regarding Rsubread (featureCounts is part of it) to this Support Site as Rsubread is a Bioconductor package.

In general I don't recommend throwing away reads with low MAQ value when you count them to genes. This is because an RNA-seq read with low MAQ is likely to be correctly mapped if it hits an annotated gene (more precisely it hits an exon of a gene). The transcriptome only accounts for ~2% of the genome, therefore the chance that a read incorrectly mapped to the genome happens to map to the transcriptome is very low.

Regarding MAQ values reported by Rsubread, we have recently changed them to the range of 0 to 40 to make it be in line with the Phred score range.

ADD COMMENTlink written 3 months ago by Wei Shi3.1k

Hi Wei Shi, Thank you so much. Your suggestion is very helpful.

ADD REPLYlink written 3 months ago by Gary10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 272 users visited in the last hour