Read in specific rows of a BAM file that match a certain RNAME
1
1
Entering edit mode
KB ▴ 50
@k-8495
Last seen 15 months ago
United States

Hello,

I am trying to read in a BAM file using RSamtools into R. I only want to read in those rows that match a specific "RNAME" (eg:  I only want to read rows where RNAME = "ref2")

I have looked through the vignette and online resources, and this is what I tried so far.

Obviously my "which" argument for ScanBamParam is incorrect. Could someone help me correct this query ? 

If there is no way to do this, is my only choice reading in all the rows in BAM file, and then sub-setting based on my RNAME match ?

inputBam = "test.bam"
bamFile <- BamFile(inputBam)
seqInfo1 <- seqinfo(bamFile)

seqInfo1
Seqinfo object with 2 sequences from an unspecified genome:
    seqnames seqlengths isCircular genome
    ref1     159322         NA   <NA>
    ref2     162114         NA   <NA>

ref= "ref2"    
# trying to read only rows in BAM file where RNAME = "ref2"
params <- ScanBamParam(which = GRanges(ref,seqlengths = 162114), what = c("qname", "flag","rname","pos","mapq","cigar"))

Error in ScanBamParam(which = GRanges(ref, seqlengths = 162114), what = c("qname",  : error in evaluating the argument 'which' in selecting a method for function 'ScanBamParam': Error in asMethod(object) :  The character vector to convert to a GRanges object must contain strings of the form "chr1:2501-2800" or  "chr1:2501-2800:+" (".." being also supported as a separator between the start and end positions). Strand can be  "+", "-", "*", or missing.

temp <- Rsamtools::scanBam(file =bam.file , param = params)    

Thank you.

 

scanbamparam bam • 1.6k views
ADD COMMENT
0
Entering edit mode

I would like to add that I am dealing with non-human genomes. So I don't have an IRanges to supply in the "which" condition as mentioned in here: filterBam using rname 

ADD REPLY
2
Entering edit mode
@martin-morgan-1513
Last seen 2 hours ago
United States

For a sample file I ran

library(Rsamtools)
example(scanBam)
bfl = BamFile(fl)

Grab the seqinfo, subset to the sequence(s) that you're interested in, and coerce to a GRanges

which <- as(seqinfo(bfl)["seq2"], "GRanges")

Then read the relevant reads

scanBam(bfl, param=ScanBamParam(which=which)

Usually it's more convenient to read the data at a slightly higher level, via one of the GenomicAlignments::readGAlignment*() functions. 

The error you see is because of the way you're trying to construct a GRanges, which might instead be

which <- GRanges("ref2", IRanges(1, 162114))

and then as above. Usually when you see an error it's helpful to make the function you're invoking less 'nested', to figure out which part is causing problems. Using this strategy might have lead to

> GRanges(ref,seqlengths = 162114)
Error in asMethod(object) : 
  The character vector to convert to a GRanges object must contain
  strings of the form "chr1:2501-2800" or "chr1:2501-2800:+" (".." being
  also supported as a separator between the start and end positions).
  Strand can be "+", "-", "*", or missing.

So that you'd know where to look.

 

ADD COMMENT
0
Entering edit mode

Thank you for the response.

Coercing the information into GRanges worked.

which <- as(seqinfo(bfl)["seq2"], "GRanges")
ADD REPLY

Login before adding your answer.

Traffic: 819 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6