Question: Read in specific rows of a BAM file that match a certain RNAME
1
gravatar for K
3.7 years ago by
K50
United States
K50 wrote:

Hello,

I am trying to read in a BAM file using RSamtools into R. I only want to read in those rows that match a specific "RNAME" (eg:  I only want to read rows where RNAME = "ref2")

I have looked through the vignette and online resources, and this is what I tried so far.

Obviously my "which" argument for ScanBamParam is incorrect. Could someone help me correct this query ? 

If there is no way to do this, is my only choice reading in all the rows in BAM file, and then sub-setting based on my RNAME match ?

inputBam = "test.bam"
bamFile <- BamFile(inputBam)
seqInfo1 <- seqinfo(bamFile)

seqInfo1
Seqinfo object with 2 sequences from an unspecified genome:
    seqnames seqlengths isCircular genome
    ref1     159322         NA   <NA>
    ref2     162114         NA   <NA>

ref= "ref2"    
# trying to read only rows in BAM file where RNAME = "ref2"
params <- ScanBamParam(which = GRanges(ref,seqlengths = 162114), what = c("qname", "flag","rname","pos","mapq","cigar"))

Error in ScanBamParam(which = GRanges(ref, seqlengths = 162114), what = c("qname",  : error in evaluating the argument 'which' in selecting a method for function 'ScanBamParam': Error in asMethod(object) :  The character vector to convert to a GRanges object must contain strings of the form "chr1:2501-2800" or  "chr1:2501-2800:+" (".." being also supported as a separator between the start and end positions). Strand can be  "+", "-", "*", or missing.

temp <- Rsamtools::scanBam(file =bam.file , param = params)    

Thank you.

 

bam scanbamparam • 766 views
ADD COMMENTlink modified 3.7 years ago by Martin Morgan ♦♦ 23k • written 3.7 years ago by K50

I would like to add that I am dealing with non-human genomes. So I don't have an IRanges to supply in the "which" condition as mentioned in here: filterBam using rname 

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by K50
Answer: Read in specific rows of a BAM file that match a certain RNAME
2
gravatar for Martin Morgan
3.7 years ago by
Martin Morgan ♦♦ 23k
United States
Martin Morgan ♦♦ 23k wrote:

For a sample file I ran

library(Rsamtools)
example(scanBam)
bfl = BamFile(fl)

Grab the seqinfo, subset to the sequence(s) that you're interested in, and coerce to a GRanges

which <- as(seqinfo(bfl)["seq2"], "GRanges")

Then read the relevant reads

scanBam(bfl, param=ScanBamParam(which=which)

Usually it's more convenient to read the data at a slightly higher level, via one of the GenomicAlignments::readGAlignment*() functions. 

The error you see is because of the way you're trying to construct a GRanges, which might instead be

which <- GRanges("ref2", IRanges(1, 162114))

and then as above. Usually when you see an error it's helpful to make the function you're invoking less 'nested', to figure out which part is causing problems. Using this strategy might have lead to

> GRanges(ref,seqlengths = 162114)
Error in asMethod(object) : 
  The character vector to convert to a GRanges object must contain
  strings of the form "chr1:2501-2800" or "chr1:2501-2800:+" (".." being
  also supported as a separator between the start and end positions).
  Strand can be "+", "-", "*", or missing.

So that you'd know where to look.

 

ADD COMMENTlink written 3.7 years ago by Martin Morgan ♦♦ 23k

Thank you for the response.

Coercing the information into GRanges worked.

which <- as(seqinfo(bfl)["seq2"], "GRanges")
ADD REPLYlink written 3.7 years ago by K50
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 189 users visited in the last hour