Question

Possible bug in Rsubread Phred-Score read

caufeminecraft ▴ 20

@caufeminecraft-7026

Last seen 10.3 years ago

United States

I have been using Rsubread to align a large number of paired-end gzFASTQ files with the align() function, and I have been careful to log all of the output so that I can keep track of summary information and any error or warning messages. Of the alignments that reported a possibly incorrect Phred offset (using the default of +33), only one actually had an incorrect offset. The remaining 4-5 alignments (involving roughly 8-10 fq.gz files) all reported an observed range of [35,84], which is not a valid score range.

To check this, I ran zcat on the fq.gz files, used AWK to extract every 4th line to a text file, and then wrote and ran a script that tracks the running minimum and maximum observed ASCII values. In all cases, the range was actually 35-74. My hypothesis (for which I have no evidence) is that the quality scores are being read from a line containing read sequence information, since 'T' is ASCII 84 and is also the nucleotide abbreviation with the highest ASCII value.

I have the quality scores for the paired-end reads for one sample, as well as the R console output from a successful reproduction of this possible bug. The former are 333MB each, so I cannot provide them in their entirety; however, I can provide the full output. The files in question were obtained from the PCGC data hub, so anyone with a PCGC account can retrieve them with ExpeDat (https://pcgc.research.chop.edu/data_expedition). The remote names differ slightly from the names of my local copies, but they are the same files. Here are the remote paths:

resrhdxp01.research.chop.edu:/pcgc/public/Other/transcriptome/fastq/PCGC0065312_HS_TX__1-02059__v1_FC427_L8_p3of6_P2.fastq.gz
resrhdxp01.research.chop.edu:/pcgc/public/Other/transcriptome/fastq/PCGC0065308_HS_TX__1-02059__v1_FC427_L8_p3of6_P1.fastq.gz

The command I ran to extract every fourth line is

zcat $1 | awk '!(NR%4)'

The script I used to calculate the minimum and maximum values was written in Haskell, and the source is short enough that I will include it here:

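import System.IO (isEOF)
import Control.Monad (unless)
import Data.Char (ord)
import Data.List (foldl')

-- Read quality lines from stdin and print the running min-max ASCII codes.
main :: IO ()
main = getLine >>= loop . minmax

-- Print the current range, then fold in the next line until end of input.
loop :: (Char, Char) -> IO ()
loop t@(lo, hi) = do
    putStrLn (show (ord lo) ++ "-" ++ show (ord hi))
    eof <- isEOF
    unless eof (getLine >>= loop . merge t . minmax)

-- Smallest and largest character of one line (the line must be non-empty).
minmax :: String -> (Char, Char)
minmax (h:x) = foldl' step (h, h) x
  where
    step (a, b) c
        | c < a     = (c, b)
        | c > b     = (a, c)
        | otherwise = (a, b)

-- Combine two ranges into the range that covers both.
merge :: (Char, Char) -> (Char, Char) -> (Char, Char)
merge (a, b) (c, d) = (min a c, max b d)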

If you save this to "phred.hs", then compile it with "ghc phred.hs", you can run

zcat $1 | awk '!(NR%4)' | ./phred | uniq

to view the running cumulative minimum and maximum ASCII values. If any information I provided here is insufficiently descriptive or ambiguously worded, or if more information is necessary, I will be happy to provide whatever I can to help resolve this bug.
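The same check can also be sketched in Python using only the standard library. The function name `quality_ascii_range` is an assumption made for illustration, and the sketch assumes plain four-line FASTQ records (no wrapped sequence lines), just as the awk command does.

```python
import gzip


def quality_ascii_range(path):
    """Return (min, max) ASCII codes over every 4th line of a gzipped FASTQ file."""
    lo, hi = None, None
    with gzip.open(path, "rb") as handle:
        for line_number, line in enumerate(handle, start=1):
            if line_number % 4 != 0:
                continue  # only every 4th line holds quality characters
            quals = line.rstrip(b"\r\n")
            if not quals:
                continue  # skip empty quality lines
            # Iterating over bytes yields integers, i.e. the ASCII codes.
            line_lo, line_hi = min(quals), max(quals)
            lo = line_lo if lo is None else min(lo, line_lo)
            hi = line_hi if hi is None else max(hi, line_hi)
    return lo, hi
```

Calling `quality_ascii_range("reads.fq.gz")` on a file of interest (the file name here is a placeholder) returns the overall range in one pass rather than printing the running values.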

Here is the output of my R session:

R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> require(Rsubread)
Loading required package: Rsubread
> align(index = "genome",
+       readfile1 = "120521--FC427--L8--3of6--1-02059--RA--SeqInst3--R1.fq.gz",
+       readfile2 = "120521--FC427--L8--3of6--1-02059--RA--SeqInst3--R2.fq.gz",
+       input_format = "gzFASTQ",
+       output_format = "BAM",
+       output_file = "test.bam",
+       tieBreakHamming = TRUE,
+       unique = TRUE,
+       indels = 5,
+       nthreads = 1,
+       PE_orientation = "fr")

        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.15.9

//========================== subread-align setting ===========================\\
||                                                                            ||
||           Function : Read alignment                                        ||
||            Threads : 1                                                     ||
||       Input file 1 : 120521--FC427--L8--3of6--1-02059--RA--SeqInst3--R ... ||
||       Input file 2 : 120521--FC427--L8--3of6--1-02059--RA--SeqInst3--R ... ||
||        Output file : test.bam (BAM)                                        ||
||         Index name : genome                                                ||
||       Phred offset : 33                                                    ||
||                                                                            ||
||    Min read1 votes : 3                                                     ||
||    Min read2 votes : 1                                                     ||
||  Max fragment size : 600                                                   ||
||  Min fragment size : 50                                                    ||
||                                                                            ||
||         Max indels : 5                                                     ||
||  # of Best mapping : 1                                                     ||
||     Unique mapping : yes                                                   ||
||   Hamming distance : yes                                                   ||
||     Quality scores : no                                                    ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//====================== Running (11-Nov-2014 14:44:53) ======================\\
||                                                                            ||
|| Decompress 120521--FC427--L8--3of6--1-02059--RA--SeqInst3--R1.fq.gz...     ||
|| The input file contains base space reads.                                  ||
|| Decompress 120521--FC427--L8--3of6--1-02059--RA--SeqInst3--R2.fq.gz...     ||
|| WARNING The specified phred-score offset (33) seems to be incorrect.       ||
||         The observed phred-score range is [35,84].                         ||
||                                                                            ||

-------
ABORTED
-------

> require(Rsubread)
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rsubread_1.15.9

Wei Shi ★ 3.6k

@wei-shi-2183

Last seen 11 weeks ago

Australia/Melbourne

Thanks for the detailed report of the problem you encountered. The Rsubread package includes a function called 'qualityScores', which can be used to retrieve Phred scores for read bases in FASTQ files. Could you run the following commands and let me know what you get:

library(Rsubread)

qs <- qualityScores("120521--FC427--L8--3of6--1-02059--RA--SeqInst3--R1.fq.gz",offset=33)

summary(qs)

Thanks,

Wei

caufeminecraft ▴ 20

@caufeminecraft-7026

Last seen 10.3 years ago

United States

Thanks for your help. I ran the commands you suggested, the output of which I am including below. I am also doing the same for the second file.

First File (--R1.fq.gz)

qualityScores Rsubread 1.15.9

Scan the input file...
  1000000 reads have been scanned in 13.9 seconds.
  2000000 reads have been scanned in 27.8 seconds.
  3000000 reads have been scanned in 41.6 seconds.
  4000000 reads have been scanned in 55.5 seconds.
  5000000 reads have been scanned in 69.3 seconds.
  6000000 reads have been scanned in 83.1 seconds.
Totally 6842471 reads were scanned; the sampling interval is 684.
Now extract read quality information...
  1000000 reads have been scanned in 108.8 seconds.
  2000000 reads have been scanned in 122.8 seconds.
  3000000 reads have been scanned in 136.8 seconds.
  4000000 reads have been scanned in 150.9 seconds.
  5000000 reads have been scanned in 164.9 seconds.
  6000000 reads have been scanned in 178.9 seconds.

Completed successfully. Quality scores for 10000 reads (equally spaced in the file) are returned.

The text alignment for the summary might be a bit off, but I hope that it is nonetheless useful. If you need more information, I will try to provide it in a prompt fashion. I ran the command twice, once with the default value of nreads (10000) and a second time with the total number of reads (6842471, according to the progress output).

10000 Reads

       1               2               3               4      
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.:16.00   1st Qu.:10.00   1st Qu.:12.00   1st Qu.:17.00
Median :28.00   Median :28.00   Median :28.00   Median :32.00
Mean   :22.66   Mean   :21.99   Mean   :22.16   Mean   :25.25
3rd Qu.:31.00   3rd Qu.:31.00   3rd Qu.:31.00   3rd Qu.:35.00
Max.   :34.00   Max.   :34.00   Max.   :34.00   Max.   :37.00
       5               6               7               8      
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.:10.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :31.00   Median :31.00   Median :30.00   Median :30.00
Mean   :24.07   Mean   :23.48   Mean   :21.92   Mean   :22.28
3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00
Max.   :37.00   Max.   :37.00   Max.   :37.00   Max.   :37.00
       9               10              11              12              13    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.0
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.0
Median :30.00   Median :30.00   Median :30.00   Median :30.00   Median :30.0
Mean   :22.93   Mean   :22.92   Mean   :23.06   Mean   :23.04   Mean   :22.8
3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.0
Max.   :39.00   Max.   :39.00   Max.   :39.00   Max.   :39.00   Max.   :39.0
       14              15              16              17     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :30.00   Median :30.00   Median :30.00   Median :30.00
Mean   :23.22   Mean   :22.95   Mean   :22.98   Mean   :23.03
3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       18              19              20              21     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :30.00   Median :29.00   Median :30.00   Median :29.00
Mean   :22.95   Mean   :22.55   Mean   :22.67   Mean   :22.25
3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       22              23              24              25     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :30.00   Median :29.00   Median :29.00   Median :29.00
Mean   :22.52   Mean   :22.21   Mean   :22.14   Mean   :22.11
3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       26              27              28              29     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :28.00   Median :28.00   Median :28.00   Median :27.00
Mean   :21.78   Mean   :21.69   Mean   :21.65   Mean   :21.23
3rd Qu.:37.00   3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       30              31              32             33              34    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.0   Min.   : 2.00   Min.   : 2.0
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.0   1st Qu.: 2.00   1st Qu.: 2.0
Median :28.00   Median :28.00   Median :27.0   Median :27.00   Median :28.0
Mean   :21.42   Mean   :21.35   Mean   :21.1   Mean   :20.99   Mean   :21.1
3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.0   3rd Qu.:36.00   3rd Qu.:36.0
Max.   :41.00   Max.   :41.00   Max.   :41.0   Max.   :41.00   Max.   :41.0
       35              36              37              38     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :27.00   Median :27.00   Median :27.00   Median :27.00
Mean   :20.84   Mean   :20.88   Mean   :20.59   Mean   :20.49
3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       39              40              41              42     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :26.00   Median :26.00   Median :27.00   Median :26.00
Mean   :20.41   Mean   :20.42   Mean   :20.41   Mean   :20.05
3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       43              44              45              46     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :26.00   Median :26.00   Median :24.00   Median :24.00
Mean   :19.97   Mean   :19.81   Mean   :19.41   Mean   :19.29
3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       47              48              49              50     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :23.00   Median :23.00   Median :21.00   Median :18.00
Mean   :18.94   Mean   :18.87   Mean   :18.53   Mean   :18.01
3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:34.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00

Second time around:

qualityScores Rsubread 1.15.9

Scan the input file...
  1000000 reads have been scanned in 14.2 seconds.
  2000000 reads have been scanned in 28.2 seconds.
  3000000 reads have been scanned in 42.1 seconds.
  4000000 reads have been scanned in 56.0 seconds.
  5000000 reads have been scanned in 69.8 seconds.
  6000000 reads have been scanned in 83.7 seconds.
Totally 6842471 reads were scanned; the sampling interval is 1.
Now extract read quality information...
  1000000 reads have been scanned in 120.0 seconds.
  2000000 reads have been scanned in 144.6 seconds.
  3000000 reads have been scanned in 169.3 seconds.
  4000000 reads have been scanned in 193.9 seconds.
  5000000 reads have been scanned in 218.9 seconds.
  6000000 reads have been scanned in 243.6 seconds.

Completed successfully. Quality scores for 6842470 reads (equally spaced in the file) are returned.

6842470 Reads

       1               2               3               4     
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.:16.00   1st Qu.:12.00   1st Qu.:16.00   1st Qu.:17.00
Median :28.00   Median :28.00   Median :28.00   Median :32.00
Mean   :22.81   Mean   :22.16   Mean   :22.37   Mean   :25.35
3rd Qu.:31.00   3rd Qu.:31.00   3rd Qu.:31.00   3rd Qu.:35.00
Max.   :34.00   Max.   :34.00   Max.   :34.00   Max.   :37.00
       5               6              7               8               9     
Min.   : 2.00   Min.   : 2.0   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.:10.00   1st Qu.: 8.0   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :32.00   Median :31.0   Median :30.00   Median :30.00   Median :30.00
Mean   :24.21   Mean   :23.7   Mean   :22.05   Mean   :22.49   Mean   :23.03
3rd Qu.:35.00   3rd Qu.:35.0   3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00
Max.   :37.00   Max.   :37.0   Max.   :37.00   Max.   :37.00   Max.   :39.00
       10              11              12              13    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :30.00   Median :30.00   Median :30.00   Median :30.00
Mean   :23.03   Mean   :23.12   Mean   :23.15   Mean   :22.93
3rd Qu.:35.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00
Max.   :39.00   Max.   :39.00   Max.   :39.00   Max.   :39.00
       14              15              16              17    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :30.00   Median :30.00   Median :30.00   Median :30.00
Mean   :23.36   Mean   :23.04   Mean   :23.17   Mean   :23.18
3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       18              19              20              21    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :30.00   Median :30.00   Median :30.00   Median :29.00
Mean   :23.17   Mean   :22.79   Mean   :22.83   Mean   :22.45
3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       22              23              24              25    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :30.00   Median :30.00   Median :29.00   Median :29.00
Mean   :22.67   Mean   :22.41   Mean   :22.21   Mean   :22.18
3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       26              27              28              29    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :29.00   Median :28.00   Median :28.00   Median :27.00
Mean   :21.95   Mean   :21.77   Mean   :21.79   Mean   :21.34
3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:37.00   3rd Qu.:36.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       30              31              32              33    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :28.00   Median :28.00   Median :27.00   Median :27.00
Mean   :21.59   Mean   :21.48   Mean   :21.19   Mean   :21.08
3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       34              35           36              37              38    
Min.   : 2.00   Min.   : 2   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :27.00   Median :27   Median :27.00   Median :27.00   Median :27.00
Mean   :21.13   Mean   :21   Mean   :20.99   Mean   :20.67   Mean   :20.65
3rd Qu.:36.00   3rd Qu.:36   3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00
Max.   :41.00   Max.   :41   Max.   :41.00   Max.   :41.00   Max.   :41.00
       39              40              41              42    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :27.00   Median :27.00   Median :27.00   Median :26.00
Mean   :20.52   Mean   :20.53   Mean   :20.49   Mean   :20.17
3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00   3rd Qu.:36.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       43              44              45              46    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :26.00   Median :26.00   Median :25.00   Median :24.00
Mean   :20.05   Mean   :19.94   Mean   :19.66   Mean   :19.43
3rd Qu.:36.00   3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00
       47              48              49              50    
Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00
1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00
Median :24.00   Median :23.00   Median :22.00   Median :19.00
Mean   :19.14   Mean   :18.98   Mean   :18.69   Mean   :18.15
3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00   3rd Qu.:35.00
Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00

Second File (--R2.fq.gz)

Output:

 
qualityScores Rsubread 1.15.9

Scan the input file...
  1000000 reads have been scanned in 13.9 seconds.
  2000000 reads have been scanned in 27.5 seconds.
  3000000 reads have been scanned in 41.5 seconds.
  4000000 reads have been scanned in 55.6 seconds.
  5000000 reads have been scanned in 70.5 seconds.
  6000000 reads have been scanned in 84.7 seconds.
Totally 6842471 reads were scanned; the sampling interval is 684.
Now extract read quality information...
  1000000 reads have been scanned in 111.1 seconds.
  2000000 reads have been scanned in 125.1 seconds.
  3000000 reads have been scanned in 139.6 seconds.
  4000000 reads have been scanned in 154.0 seconds.
  5000000 reads have been scanned in 168.3 seconds.
  6000000 reads have been scanned in 182.7 seconds.

Completed successfully. Quality scores for 10000 reads (equally spaced in the file) are returned.

10000 Reads

       1               2                3               4        
 Min.   : 2.00   Min.   : 2.000   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.000   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median : 2.000   Median :10.00   Median :19.00  
 Mean   :14.57   Mean   : 9.767   Mean   :12.98   Mean   :16.19  
 3rd Qu.:30.00   3rd Qu.:16.000   3rd Qu.:25.00   3rd Qu.:32.00  
 Max.   :34.00   Max.   :34.000   Max.   :34.00   Max.   :37.00  
       5               6               7                8        
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.000   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.000   1st Qu.: 2.00  
 Median :10.00   Median :17.00   Median : 2.000   Median :10.00  
 Mean   :15.22   Mean   :16.83   Mean   : 9.757   Mean   :13.72  
 3rd Qu.:32.00   3rd Qu.:32.00   3rd Qu.:16.000   3rd Qu.:17.00  
 Max.   :37.00   Max.   :37.00   Max.   :37.000   Max.   :37.00  
       9               10              11              12       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median :10.00   Median :17.00   Median :17.00  
 Mean   :14.33   Mean   :14.69   Mean   :17.02   Mean   :17.92  
 3rd Qu.:27.00   3rd Qu.:28.00   3rd Qu.:32.00   3rd Qu.:33.00  
 Max.   :39.00   Max.   :39.00   Max.   :39.00   Max.   :39.00  
       13              14              15              16       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :11.00   Median :10.00   Median :10.00   Median :10.00  
 Mean   :17.16   Mean   :15.93   Mean   :15.63   Mean   :16.17  
 3rd Qu.:34.00   3rd Qu.:32.00   3rd Qu.:32.00   3rd Qu.:32.00  
 Max.   :39.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       17              18              19              20       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median : 2.00   Median :11.00   Median :10.00  
 Mean   :15.92   Mean   :15.01   Mean   :16.48   Mean   :15.91  
 3rd Qu.:32.00   3rd Qu.:32.00   3rd Qu.:32.00   3rd Qu.:32.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       21              22              23              24       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median :10.00   Median : 9.00   Median : 9.00  
 Mean   :16.59   Mean   :17.14   Mean   :16.18   Mean   :16.11  
 3rd Qu.:32.00   3rd Qu.:34.00   3rd Qu.:33.00   3rd Qu.:33.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       25              26              27             28              29       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.0   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.0   1st Qu.: 2.00   1st Qu.: 2.00  
 Median : 2.00   Median : 9.00   Median : 8.0   Median : 8.00   Median : 2.00  
 Mean   :15.67   Mean   :16.39   Mean   :15.6   Mean   :16.01   Mean   :15.87  
 3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:32.0   3rd Qu.:33.00   3rd Qu.:34.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.0   Max.   :41.00   Max.   :41.00  
       30              31              32              33       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median : 8.00   Median : 8.00   Median : 8.00   Median : 8.00  
 Mean   :16.19   Mean   :15.99   Mean   :16.44   Mean   :16.25  
 3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       34              35              36              37              38      
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.0  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.0  
 Median : 8.00   Median : 2.00   Median : 2.00   Median : 2.00   Median : 2.0  
 Mean   :16.25   Mean   :15.09   Mean   :15.36   Mean   :15.53   Mean   :15.5  
 3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.0  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.0  
       39              40              41              42       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median : 2.00   Median : 2.00   Median : 2.00   Median : 2.00  
 Mean   :14.97   Mean   :14.39   Mean   :14.98   Mean   :15.05  
 3rd Qu.:32.00   3rd Qu.:31.00   3rd Qu.:32.00   3rd Qu.:33.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       43              44              45              46       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median : 2.00   Median : 2.00   Median : 2.00   Median : 2.00  
 Mean   :14.96   Mean   :14.96   Mean   :14.96   Mean   :14.79  
 3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       47              48              49             50       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.0   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.0   1st Qu.: 2.00  
 Median : 2.00   Median : 2.00   Median : 2.0   Median : 2.00  
 Mean   :14.41   Mean   :13.95   Mean   :13.5   Mean   :12.97  
 3rd Qu.:33.00   3rd Qu.:32.00   3rd Qu.:32.0   3rd Qu.:31.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.0   Max.   :41.00

Second time around:

qualityScores Rsubread 1.15.9

Scan the input file...
  1000000 reads have been scanned in 14.0 seconds.
  2000000 reads have been scanned in 27.7 seconds.
  3000000 reads have been scanned in 41.8 seconds.
  4000000 reads have been scanned in 55.7 seconds.
  5000000 reads have been scanned in 69.7 seconds.
  6000000 reads have been scanned in 83.8 seconds.
Totally 6842471 reads were scanned; the sampling interval is 1.
Now extract read quality information...
  1000000 reads have been scanned in 120.2 seconds.
  2000000 reads have been scanned in 144.7 seconds.
  3000000 reads have been scanned in 169.6 seconds.
  4000000 reads have been scanned in 194.5 seconds.
  5000000 reads have been scanned in 219.4 seconds.
  6000000 reads have been scanned in 244.2 seconds.

Completed successfully. Quality scores for 6842470 reads (equally spaced in the file) are returned.

6842470 Reads

       1               2                3               4        
 Min.   : 2.00   Min.   : 2.000   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.000   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median : 2.000   Median :10.00   Median :19.00  
 Mean   :14.69   Mean   : 9.695   Mean   :12.99   Mean   :16.14  
 3rd Qu.:30.00   3rd Qu.:16.000   3rd Qu.:25.00   3rd Qu.:31.00  
 Max.   :34.00   Max.   :34.000   Max.   :34.00   Max.   :37.00  
       5              6               7                8        
 Min.   : 2.0   Min.   : 2.00   Min.   : 2.000   Min.   : 2.00  
 1st Qu.: 2.0   1st Qu.: 2.00   1st Qu.: 2.000   1st Qu.: 2.00  
 Median :10.0   Median :17.00   Median : 2.000   Median :10.00  
 Mean   :15.1   Mean   :16.75   Mean   : 9.813   Mean   :13.71  
 3rd Qu.:32.0   3rd Qu.:32.00   3rd Qu.:17.000   3rd Qu.:17.00  
 Max.   :37.0   Max.   :37.00   Max.   :37.000   Max.   :37.00  
       9               10              11              12       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median :10.00   Median :17.00   Median :17.00  
 Mean   :14.31   Mean   :14.77   Mean   :17.08   Mean   :18.01  
 3rd Qu.:27.00   3rd Qu.:28.00   3rd Qu.:32.00   3rd Qu.:33.00  
 Max.   :39.00   Max.   :39.00   Max.   :39.00   Max.   :39.00  
       13              14              15              16       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :11.00   Median :10.00   Median :10.00   Median :10.00  
 Mean   :17.28   Mean   :16.13   Mean   :15.62   Mean   :16.29  
 3rd Qu.:34.00   3rd Qu.:33.00   3rd Qu.:32.00   3rd Qu.:32.00  
 Max.   :39.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       17              18              19              20       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median : 2.00   Median :16.00   Median :10.00  
 Mean   :15.98   Mean   :15.18   Mean   :16.65   Mean   :15.92  
 3rd Qu.:32.00   3rd Qu.:32.00   3rd Qu.:32.00   3rd Qu.:32.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       21              22              23             24              25       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.0   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.0   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median :10.00   Median :10.0   Median : 9.00   Median : 2.00  
 Mean   :16.53   Mean   :17.22   Mean   :16.3   Mean   :16.19   Mean   :15.76  
 3rd Qu.:32.00   3rd Qu.:34.00   3rd Qu.:33.0   3rd Qu.:33.00   3rd Qu.:32.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.0   Max.   :41.00   Max.   :41.00  
       26              27              28              29       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median :10.00   Median : 8.00   Median : 8.00   Median : 2.00  
 Mean   :16.49   Mean   :15.68   Mean   :16.19   Mean   :15.91  
 3rd Qu.:33.00   3rd Qu.:32.00   3rd Qu.:33.00   3rd Qu.:33.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       30              31              32              33              34      
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.0  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.0  
 Median : 8.00   Median : 8.00   Median : 8.00   Median : 8.00   Median : 8.0  
 Mean   :16.27   Mean   :16.07   Mean   :16.56   Mean   :16.35   Mean   :16.3  
 3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.0  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.0  
       35              36             37              38              39       
 Min.   : 2.00   Min.   : 2.0   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.0   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median : 2.00   Median : 2.0   Median : 2.00   Median : 2.00   Median : 2.00  
 Mean   :15.19   Mean   :15.6   Mean   :15.61   Mean   :15.68   Mean   :15.13  
 3rd Qu.:33.00   3rd Qu.:33.0   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:32.00  
 Max.   :41.00   Max.   :41.0   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       40              41              42              43       
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00  
 Median : 2.00   Median : 2.00   Median : 2.00   Median : 2.00  
 Mean   :14.46   Mean   :15.02   Mean   :15.15   Mean   :15.12  
 3rd Qu.:31.00   3rd Qu.:31.00   3rd Qu.:33.00   3rd Qu.:33.00  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00  
       44              45              46              47              48      
 Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.00   Min.   : 2.0  
 1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.00   1st Qu.: 2.0  
 Median : 2.00   Median : 2.00   Median : 2.00   Median : 2.00   Median : 2.0  
 Mean   :15.13   Mean   :15.15   Mean   :14.89   Mean   :14.66   Mean   :14.1  
 3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.00   3rd Qu.:33.0  
 Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.00   Max.   :41.0  
       49              50       
 Min.   : 2.00   Min.   : 2.00  
 1st Qu.: 2.00   1st Qu.: 2.00  
 Median : 2.00   Median : 2.00  
 Mean   :13.65   Mean   :13.17  
 3rd Qu.:32.00   3rd Qu.:31.00  
 Max.   :41.00   Max.   :41.00

score 3 · Accepted Answer · 2014-11-12

Thanks for running the qualityScores function and providing its output. The qualityScores output shows that the Phred scores of your data fall within the range 0 to 41, which is typical for sequence data generated by Illumina and other sequencers. The offset that the sequencer added to the Phred scores of your data is 33.
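As a quick arithmetic check, the ASCII ranges mentioned in this thread can be converted to Phred scores by subtracting the offset. The helper name `phred_range` is hypothetical and only illustrates the subtraction.

```python
def phred_range(ascii_lo, ascii_hi, offset=33):
    """Convert an observed ASCII range into Phred scores for a given offset."""
    return ascii_lo - offset, ascii_hi - offset


print(phred_range(35, 74))  # range measured directly from the quality lines
print(phred_range(35, 84))  # range reported in the align() warning
print(ord("J"), ord("T"))   # ASCII codes of 'J' and 'T'
```

With offset 33, the measured range 35-74 corresponds to Phred 2-41, which matches the minimum and maximum values in the qualityScores summaries, whereas an upper bound of 84 would imply a Phred score of 51.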

Sorry, the warning message in the align() screen output is misleading. It is caused by a bug in the code that tests whether the offset should be 33 or 64. Note that the align() function reads in the read sequences and quality scores correctly.

You can simply ignore this warning message. The align() function always uses the offset specified by the user for the alignments. So for your data, the offset 33 was correctly used. Therefore the mapping results you got should be correct.
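For readers who want to sanity-check an offset themselves, a common rule of thumb can be sketched as follows. This is not the code used inside Rsubread. The function name `guess_phred_offset` and both thresholds are assumptions chosen for illustration: +64 encodings are assumed not to use characters below ';' (ASCII 59), and +33 Illumina data is assumed to rarely exceed 'J' (ASCII 74).

```python
def guess_phred_offset(ascii_lo, ascii_hi):
    """Guess 33 or 64 from an observed ASCII range; return None when ambiguous.

    The thresholds are illustrative assumptions, not taken from Rsubread.
    """
    if ascii_lo < 59:
        # Characters this low cannot occur under a +64 encoding.
        return 33
    if ascii_hi > 74:
        # No low characters, but high ones: consistent with +64.
        return 64
    return None  # the range fits both encodings


print(guess_phred_offset(35, 74))   # range measured from the quality lines
print(guess_phred_offset(66, 104))  # a made-up range for comparison
```

Under these assumptions, a minimum of 35 points to offset 33 regardless of the reported maximum, which is consistent with the advice above to keep using 33.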