Rsamtools: readGAlignmentsFromBam weird behavior reading MAPQ and RNEXT fields
1
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Hi, I'm using readGAlignmentsFromBam where I defined the bam parameters to be scanned as follows: class: ScanBamParam bamFlag (NA unless specified): bamSimpleCigar: FALSE bamReverseComplement: FALSE bamTag: NH, HI, AS, nM bamWhich: 0 elements bamWhat: flag, mapq, mrnm, mpos, seq, qual Using an example of a single alignment record I get the following: ILLUMINA:223:D0CUVACXX:5:1104:11885:187863 0 chr11 96806772 255 44M * 0 0 GGCTGACCTGGTCCCTGCCAAGGAAGCCAATGTCAAGTGCCCAC JJJIJJJJJJJJJJJIJIJIJJJHJIJJJJJIJJJJJGHIIIJJ NH:i:1 HI:i:1 AS:i:43 nM:i:0 seqnames strand cigar qwidth start end width ngap | flag mapq mrnm mpos <rle> <rle> <character> <integer> <integer> <integer> <integer> <integer> | <integer> <integer> <factor> <integer> ILLUMINA:223:D0CUVACXX:5:1104:11885:187863 chr11 + 44M 44 96806772 96806815 44 0 | 0 <na> <na> 0 seq qual NH HI AS nM <dnastringset> <phredquality> <integer> <integer> <integer> <integer> ILLUMINA:223:D0CUVACXX:5:1104:11885:187863 GGCTGACCTGGTCCCTGCCAAGGAAGCCAATGTCAAGTGCCCAC JJJIJJJJJJJJJJJIJIJIJJJHJIJJJJJIJJJJJGHIIIJJ 1 1 43 0 As you can see the MAPQ field in the input is 255 but is NA in the gapped alignment object, although mapq is defined in the ScanBamParam. The RNEXT field is also NA but I guess this is the representation of its "*" value in the bam record. When I change the MAPQ in the input to be different from 255 (e.g., 3) this doesn't happen. -- output of sessionInfo(): R version 3.0.2 (2013-09-25) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] data.table_1.8.10 Rsamtools_1.13.44 Biostrings_2.29.19 GenomicRanges_1.13.45 XVector_0.1.4 IRanges_1.19.38 BiocGenerics_0.7.5 loaded via a namespace (and not attached): [1] bitops_1.0-6 stats4_3.0.2 tools_3.0.2 zlibbioc_1.7.0 -- Sent via the guest posting facility at bioconductor.org.
Alignment Alignment • 1.1k views
ADD COMMENT
0
Entering edit mode
@herve-pages-1542
Last seen 13 hours ago
Seattle, WA, United States
Hi Rubi, On 10/14/2013 04:41 PM, rubi [guest] wrote: > > Hi, > > I'm using readGAlignmentsFromBam where I defined the bam parameters to be scanned as follows: > class: ScanBamParam > bamFlag (NA unless specified): > bamSimpleCigar: FALSE > bamReverseComplement: FALSE > bamTag: NH, HI, AS, nM > bamWhich: 0 elements > bamWhat: flag, mapq, mrnm, mpos, seq, qual > > Using an example of a single alignment record I get the following: > ILLUMINA:223:D0CUVACXX:5:1104:11885:187863 0 chr11 96806772 255 44M * 0 0 GGCTGACCTGGTCCCTGCCAAGGAAGCCAATGTCAAGTGCCCAC JJJIJJJJJJJJJJJIJIJIJJJHJIJJJJJIJJJJJGHIIIJJ NH:i:1 HI:i:1 AS:i:43 nM:i:0 > > seqnames strand cigar qwidth start end width ngap | flag mapq mrnm mpos > <rle> <rle> <character> <integer> <integer> <integer> <integer> <integer> | <integer> <integer> <factor> <integer> > ILLUMINA:223:D0CUVACXX:5:1104:11885:187863 chr11 + 44M 44 96806772 96806815 44 0 | 0 <na> <na> 0 > > seq qual NH HI AS nM > <dnastringset> <phredquality> <integer> <integer> <integer> <integer> > ILLUMINA:223:D0CUVACXX:5:1104:11885:187863 GGCTGACCTGGTCCCTGCCAAGGAAGCCAATGTCAAGTGCCCAC JJJIJJJJJJJJJJJIJIJIJJJHJIJJJJJIJJJJJGHIIIJJ 1 1 43 0 > > > As you can see the MAPQ field in the input is 255 but is NA in the gapped alignment object, although mapq is defined in the ScanBamParam. The RNEXT field is also NA but I guess this is the representation of its "*" value in the bam record. > > When I change the MAPQ in the input to be different from 255 (e.g., 3) this doesn't happen. This is the intended behavior. Like "*" means the RNEXT is not available, 255 means that MAPQ is non-available. From the SAM Spec (http://samtools.sourceforge.net/SAMv1.pdf): 5. MAPQ: MAPping Quality. It equals -10*log10(Pr{mapping position is wrong}), rounded to the nearest integer. A value 255 indicates that the mapping quality is not available. Cheers, H. > > -- output of sessionInfo(): > > R version 3.0.2 (2013-09-25) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods base > > other attached packages: > [1] data.table_1.8.10 Rsamtools_1.13.44 Biostrings_2.29.19 GenomicRanges_1.13.45 XVector_0.1.4 IRanges_1.19.38 BiocGenerics_0.7.5 > > loaded via a namespace (and not attached): > [1] bitops_1.0-6 stats4_3.0.2 tools_3.0.2 zlibbioc_1.7.0 > > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > -- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
ADD COMMENT

Login before adding your answer.

Traffic: 518 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6