Search
Question: How to get GenomicAlignments to read all reads?
0
gravatar for mikko.arvas
21 months ago by
mikko.arvas0 wrote:

Hi,

when looking at a bam with a genome viewer I see reads of:

-Mapping quality  over 60
-Insert size 170 kbp
-Cigar 100M
-Paired
-Primary alingment TRUE

When trying to read them to R:

> sbf <- scanBamFlag(isPaired = NA, isProperPair = NA, isUnmappedQuery = NA,
+             hasUnmappedMate = NA, isMinusStrand = NA, isMateMinusStrand = NA,
+             isFirstMateRead = NA, isSecondMateRead = NA, isSecondaryAlignment  = NA, isNotPassingQualityControls = NA,
+             isDuplicate = NA)
> sbf2 <- ScanBamParam(flag = sbf, mapqFilter=0)
> bam  = readGAlignmentPairs(file="file.bam",param = sbf2)
> bam
GAlignmentPairs object with 0 pairs, strandMode=1, and 0 metadata columns:
   seqnames strand :    ranges --    ranges
      <Rle>  <Rle> : <IRanges> -- <IRanges>
  -------
  seqinfo: 84 sequences from an unspecified genome

I see no reads. What am I missing in here?

With scanBamFlag above I try set up things so that everything would be read in. Is there some further parameter I should look into?

Best,

Mikko

 

 

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=fi_FI.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=fi_FI.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=fi_FI.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=fi_FI.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GenomicAlignments_1.10.0   Rsamtools_1.26.1           Biostrings_2.42.1          XVector_0.14.0             SummarizedExperiment_1.4.0
 [6] Biobase_2.34.0             GenomicRanges_1.26.2       GenomeInfoDb_1.10.2        IRanges_2.8.1              S4Vectors_0.12.1          
[11] BiocGenerics_0.20.0       

loaded via a namespace (and not attached):
[1] lattice_0.20-34    bitops_1.0-6       grid_3.3.1         zlibbioc_1.20.0    Matrix_1.2-8       BiocParallel_1.8.1 tools_3.3.1      

ADD COMMENTlink written 21 months ago by mikko.arvas0

It's difficult to know what the issue may be without seeing an example of your data.  Is it possible for you to make a small subset of your bam file available?

I would also try running the function using the default parameters, which are pretty inclusive and should get most reads, e.g.

readGAlignmentPairs('file.bam')

This should help identify if the problem is the function, the data, or the parameters.  Similarly you can try running readGAlignments() to make sure this isn't an issue with the pairing of your data.

readGAlignments('file.bam')
ADD REPLYlink written 21 months ago by Mike Smith3.0k

Thank you Mike!

> bam <- readGAlignmentPairs("test.bam")
> cvg <- coverage(bam)
> max(cvg$`6`)
[1] 1

Wrong.

> bam <- readGAlignments("test.bam")
> cvg <- coverage(bam)
> max(cvg$`6`)
[1] 50

OK!
This is what I see with Baseplayer https://www.cs.helsinki.fi/u/rkataine/BasePlayer/info.html Genome browser.

It looks like it is something in the pairing of the reads. Browser sees the pairs but readGAlignmentPairs() misses them.

If somebody is interested to have a look I can send a test file.

Best,

Mikko

ADD REPLYlink modified 21 months ago • written 21 months ago by mikko.arvas0

readGAlignmentList() might be informative -- it can include unmated reads as well as reads that are ambiguously mated. Identify reads that you think should have been mated. Use ScanBamParam() to access the information used for mating and described in ?readGAlignmentsList to identify what criterion prevents them from being mated.

ADD REPLYlink written 21 months ago by Martin Morgan ♦♦ 22k

If the workflow Martin suggests doesn't shed any light I can look at the files for you. Make the files (or subsets) available to valerie.obenchain@roswellpark.org. 

Valerie

ADD REPLYlink written 21 months ago by Valerie Obenchain ♦♦ 6.6k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 247 users visited in the last hour