Question: Problem with scanBam
1
gravatar for felix.klein
4.5 years ago by
felix.klein150
Germany
felix.klein150 wrote:

Hello,

somehow the scanBam function does not properly read in the mapq scores from this file:

HWI-ST227:407:C56MLACXX:1:1316:6877:64103    0    chr10    3153911    255    24M    *    0    0    CCCAATTAATGGGGGCAGGATAAG    HFHGHI@DAEI@GGIDEBGFGHHG    NH:i:1    HI:i:1    AS:i:19    nM:i:2
HWI-ST227:407:C56MLACXX:1:2102:2752:79037    0    chr10    3153911    255    24M    *    0    0    CCCAATTAATGGGGGCAGGATAAG    HGIJJIIJJIGHIJJJIJJIJJIG    NH:i:1    HI:i:1    AS:i:19    nM:i:2
HWI-ST227:407:C56MLACXX:1:2113:3448:85785    0    chr10    3175788    255    24M    *    0    0    TCTTTTAGATCTATTTACTCTGGA    ########################    NH:i:1    HI:i:1    AS:i:19    nM:i:2
HWI-ST227:407:C56MLACXX:1:1203:13148:73589    16    chr10    3180818    255    24M    *    0    0    TAGGCTTCAAGGAGAGTCTCCTGT    ?DBEIIEDD?99>C@DEEE9FFC4    NH:i:1    HI:i:1    AS:i:23    nM:i:0

They all should be 255 but are read in as NA.

The code I used was just:

scanBam("test.bam")[[1]]$mapq
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[34] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

What could be the reason for this?

Best regards,

Felix

PS: How could I provide the test.bam file for testing?

scanbamparam scanbam • 570 views
ADD COMMENTlink modified 4.5 years ago by Martin Morgan ♦♦ 23k • written 4.5 years ago by felix.klein150
Answer: Problem with scanBam
1
gravatar for Martin Morgan
4.5 years ago by
Martin Morgan ♦♦ 23k
United States
Martin Morgan ♦♦ 23k wrote:

It's an intentional (maybe misguided?) interpretation of the SAM spec,

5. MAPQ: MAPping Quality. It equals −10 log10 Pr{mapping position is wrong},
   rounded to the nearest integer. A value 255 indicates that the mapping
   quality is not available.

translating the magic number '255' to the R representation 'NA'.

ADD COMMENTlink modified 4.5 years ago • written 4.5 years ago by Martin Morgan ♦♦ 23k

Martin I see your point, however the value 255 for MAPQ is often used (in particular by the aligner topHat and STAR) for uniquely mapped reads. 
This may not be a good practice ("No alignments should be assigned mapping quality 255" (SAM spec, "2 Recommended practice for the SAM format"), but it is used, maybe by default scanBAM could allow 255?

ADD REPLYlink written 4.5 years ago by samuel collombet140
1

I'll likely change the behavior, unless I hear arguments against...

ADD REPLYlink written 4.5 years ago by Martin Morgan ♦♦ 23k

Should be updated in Rsamtools 1.19.42, in svn now and hopefully via biocLite() after Sunday at 10am ish

ADD REPLYlink written 4.5 years ago by Martin Morgan ♦♦ 23k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 151 users visited in the last hour