possible bug resulting in warnings when reading tags with scanBam and ScanBamParam
1
0
Entering edit mode
@robert-k-bradley-5997
Last seen 8 months ago
United States

Hello,

I'm getting a strange warning message when I attempt to read in tag information from BAM files with scanBam. I haven't seen this warning in the past. I don't get the warning message if I ignore tags. The BAM file appears to be read in despite the warnings. Please see the following example, in which the file "test.bam" is the BAM-format version of this SAM file:

@HD     VN:1.0  SO:unsorted
@SQ     SN:WT   LN:180
@SQ     SN:Y92del       LN:180
@SQ     SN:R94_P95insR  LN:180
@SQ     SN:P95_R102del  LN:180
@SQ     SN:P95_D97del   LN:180
@SQ     SN:P96_D97insP  LN:180
@SQ     SN:S101fs*21    LN:180
@SQ     SN:G104fs*>118  LN:180
@PG     ID:bowtie2      PN:bowtie2      VN:2.2.9        CL:"/home/rbradley/projects/gene_expression/software/Linux.x86_64/bin/bowtie2-align-s --wrapper basic-0 -x /fh/scratch/delete30/bradley_r/scratch_space_rbradley/tmp/Rtmp3T3QV3/file27dd3730600b --quiet --end-to-end --sensitive -k 1 --threads 3 --passthrough -U /tmp/18622.unp"
SRR7236962.1491363      0       WT      58      255     100M    *       0       0       GCTGCGGGTGCAAATGGCGCGCTACGGCCGCCCCCCGGACTCACACCACAGCCGCCGGGGACCGCCACCCCGCAGGTACGGGGGCGGTGGCTACGGACGC   AAFFFKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKFKKK   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:100        YT:Z:UU

The example follows:

> bam = scanBam ("~/test.bam", param = ScanBamParam (what = c ("qname", "rname"), tag = character (0)))
> bam = scanBam ("~/test.bam", param = ScanBamParam (what = c ("qname", "rname"), tag = c ("XN", "XM")))
Warning messages:
1: In doTryCatch(return(expr), name, parentenv, handler) :
  length of NULL cannot be changed
2: In doTryCatch(return(expr), name, parentenv, handler) :
  length of NULL cannot be changed

> bam
[[1]]
[[1]]$qname
[1] "SRR7236962.1491363"

[[1]]$rname
[1] WT
Levels: WT Y92del R94_P95insR P95_R102del P95_D97del P96_D97insP S101fs*21 G104fs*>118

[[1]]$tag
[[1]]$tag$XN
[1] 0

[[1]]$tag$XM
[1] 0
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.5

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  grDevices utils     datasets  stats     graphics  methods   base     

other attached packages:
 [1] Rsamtools_1.32.0     Biostrings_2.48.0    XVector_0.20.0       GenomicRanges_1.32.3 GenomeInfoDb_1.16.0  IRanges_2.14.10      S4Vectors_0.18.3     BiocGenerics_0.26.0 
 [9] dplyr_0.7.6          tidyr_0.8.1          readr_1.1.1          tibble_1.4.2         magrittr_1.5        

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17           bindr_0.1.1            zlibbioc_1.26.0        hms_0.4.2              BiocParallel_1.14.1    tidyselect_0.2.4       R6_2.2.2              
 [8] rlang_0.2.1            tools_3.5.0            assertthat_0.2.0       bindrcpp_0.2.2         GenomeInfoDbData_1.1.0 purrr_0.2.5            bitops_1.0-6          
[15] RCurl_1.95-4.10        glue_1.2.0             compiler_3.5.0         pillar_1.2.3           pkgconfig_2.0.1       

 

Any help/advice would be greatly appreciated.

Thank you,

Robert Bradley

 

Rsamtools • 1.2k views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 4 months ago
United States

Thanks, I think this is fixed in v. 1.32.1 (release 3.7)  / 1.33.1 (devel). These will be available via biocLite() in the next 24 - 48 hours, or via

biocLite("Bioconductor/Rsamtools", ref = "RELEASE_3_7")  # ref = "master"

 

ADD COMMENT
0
Entering edit mode

Martin, thank you for the quick response and fix. I updated Rsamtools per your instructions. I no longer gett the above warning message. However, I think that a new bug may have been introduced. I can read in the BAM version of the following SAM file without problems if I don't specify parameters:

@HD     VN:1.0  SO:unsorted
@SQ     SN:WT   LN:180
@SQ     SN:Y92del       LN:180
@SQ     SN:R94_P95insR  LN:180
@SQ     SN:P95_R102del  LN:180
@SQ     SN:P95_D97del   LN:180
@SQ     SN:P96_D97insP  LN:180
@SQ     SN:S101fs*21    LN:180
@SQ     SN:G104fs*>118  LN:180
@PG     ID:bowtie2      PN:bowtie2      VN:2.2.9        CL:"/home/rbradley/projects/gene_expression/software/Linux.x86_64/bin/bowtie2-align-s --wrapper basic-0 -x /fh/scratch/delete30/bradley_r/scratch_space_rbradley/tmp/Rtmp7ZFfDb/file54692b4266aa --quiet --end-to-end --sensitive -k 1 --threads 3 --passthrough -U /tmp/17762.unp"
SRR7236976.1053936      0       WT      42      255     100M    *       0       0       TGCTGGACGGCCGCGAGCTGCGGGTGCAAATGGCGCGCTACGGCCGCCCCCCGGACTCACACCACAGCCGCCGGGGACCGCCACCCCGCAGGTACGGGGG   AAFFFKKKKKKKKKKKKKKFKKKKKKKKKKK7KKFKKKKKKAFKKFKKKKKKKAFKKKKKKKKKKKKKKKKKK<<F<FF,FKKKKKKFKFKFFKKKKKKK   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:100        YT:Z:UU
SRR7236977.7004537      16      WT      80      255     100M    *       0       0       TACGGCCGCCCCCCGGACTCACACCACAGCCGCCGGGGACCGCCACCCCGCAGGTACGGGGGCGGTGGCTACGGACGCCGGAGCCGCAGCCCTAGGCGGC   ,<(<A7,A<AFAFAF<,FA,<7,A7AK<FF7KKKFAAFFFFF,7,F<FFFAFKKF<FFA<F<AFKFAKKF<<KKKFFA7<FFKKKFKKKKKKKFFFFFFA   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:100        YT:Z:UU
SRR7236977.7767786      16      WT      41      255     100M    *       0       0       GTGCTGGACGGCCGCGAGCTGCGGGTGCAAATGGCGCGCTACGGCCGCCCCCCGGACTCACACCACAGCCGCCGGGGACCGCCACCCCGCAGGTACGGGG   A7KF,<<AF<AKFA7A<F<<FA<FKAKFFAFFFKKKFA,AFFKKKKFF7AFKKKKKKK7KKKFKFFAKKFKF<<FFKF7FKFFA7<F7F<AAF7KFFFAA   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:100        YT:Z:UU

However, if I attempt to read in tags, then the file isn't read correctly:

> sapply (scanBam ("~/test.bam", param = ScanBamParam (what = c ("qname", "rname"), tag = c ("XM", "XO", "XG")))[[1]][["tag"]], length)
     XM      XO      XG 
5242880 5242880 5242880 

 

ADD REPLY
0
Entering edit mode

My sessionInfo is below (I couldn't include it in the above response due to the character limit):

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

attached base packages:
 [1] stats4    parallel  grid      grDevices utils     datasets  stats     graphics  methods   base

other attached packages:
 [1] ggplot2_2.2.1                              scales_0.5.0                               broom_0.4.5                                dplyr_0.7
.6
 [5] plyr_1.8.4                                 GenomicAlignments_1.16.0                   Rsamtools_1.32.1                           BSgenome_
1.48.0
 [9] rtracklayer_1.40.2                         AnnotationHub_2.12.0                       VennDiagram_1.6.20                         futile.lo
gger_1.4.3
[13] RColorBrewer_1.1-2                         biomaRt_2.36.1                             goseq_1.32.0                               geneLenDa
taBase_1.16.0
[17] BiasedUrn_1.07                             GO.db_3.6.0                                beanplot_1.2                               IlluminaH
umanMethylation450kmanifest_0.4.0
[21] minfi_1.26.2                               bumphunter_1.22.0                          locfit_1.5-9.1                             iterators
_1.0.9
[29] DelayedArray_0.6.0                         BiocParallel_1.14.1                        matrixStats_0.53.1                         FDb.Infin
iumMethylation.hg19_2.2.0
[33] org.Hs.eg.db_3.6.0                         TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2    GenomicFeatures_1.32.0                     Annotatio
nDbi_1.42.1
[37] Biobase_2.40.0                             GenomicRanges_1.32.3                       GenomeInfoDb_1.16.0                        IRanges_2
.14.10
[41] S4Vectors_0.18.3                           BiocGenerics_0.26.0                        seqLogo_1.46.0                             gplots_3.
0.1
[45] fastcluster_1.1.25                         Rcpp_0.12.16                               mgcv_1.8-24                                nlme_3.1-
137
[49] BiocInstaller_1.30.0                       tidyr_0.8.1                                readr_1.1.1                                tibble_1.
4.2
[53] magrittr_1.5

loaded via a namespace (and not attached):
 [1] lazyeval_0.2.1                splines_3.5.0                 digest_0.6.15                 import_1.1.0                  htmltools_0.3.6    
 [6] gdata_2.18.0                  memoise_1.1.0                 limma_3.36.1                  annotate_1.58.0               siggenes_1.54.0    
[11] prettyunits_1.0.2             colorspace_1.3-2              blob_1.1.1                    RCurl_1.95-4.10               genefilter_1.62.0  
[16] bindr_0.1.1                   GEOquery_2.48.0               survival_2.42-3               glue_1.2.0                    gtable_0.2.0       
[21] registry_0.5                  zlibbioc_1.26.0               Rhdf5lib_1.2.0                HDF5Array_1.8.0               apcluster_1.4.7    
[26] futile.options_1.0.1          DBI_1.0.0                     rngtools_1.2.4                xtable_1.8-2                  progress_1.1.2     
[31] foreign_0.8-70                bit_1.1-12                    mclust_5.4                    preprocessCore_1.42.0         httr_1.3.1        
[36] pkgconfig_2.0.1               reshape_0.8.7                 XML_3.98-1.11                 reshape2_1.4.3                tidyselect_0.2.4
[41] rlang_0.2.0                   later_0.7.2                   munsell_0.4.3                 tools_3.5.0                   RSQLite_2.1.1      
[46] stringr_1.3.0                 yaml_2.1.19                   bit64_0.9-7                   caTools_1.17.1                purrr_0.2.4        
[51] bindrcpp_0.2.2                doRNG_1.6.6                   mime_0.5                      nor1mix_1.2-3                 formatR_1.5        
[56] xml2_1.2.0                    compiler_3.5.0                interactiveDisplayBase_1.18.0 stringi_1.2.2                 lattice_0.20-35    
[61] Matrix_1.2-14                 psych_1.8.4                   multtest_2.36.0               pillar_1.2.2                  data.table_1.11.2  
[66] bitops_1.0-6                  httpuv_1.4.2                  R6_2.2.2                      promises_1.0.1                KernSmooth_2.23-15 
[71] codetools_0.2-15              lambda.r_1.2.3                MASS_7.3-50                   gtools_3.5.0                  assertthat_0.2.0   
[76] rhdf5_2.24.0                  openssl_1.0.1                 pkgmaker_0.22                 mnormt_1.5-5                  GenomeInfoDbData_1.1.0
[81] hms_0.4.2                     quadprog_1.5-5                base64_2.0                    DelayedMatrixStats_1.2.0      illuminaio_0.22.0  
[86] shiny_1.0.
ADD REPLY
0
Entering edit mode

Sorry please try again.

ADD REPLY
0
Entering edit mode

Thank you! It's working without problems for me now.

ADD REPLY

Login before adding your answer.

Traffic: 542 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6