removing reads from coverage() calculation.
1
1
Entering edit mode
gen ▴ 30
@gen-7383
Last seen 9.0 years ago
United States

Hi, say I'm working with some RNAseq reads

What I'm usually doing is taking coverage() and giving it a BAM file and a GRange for the param and I assume this will calculate coverage for the segment that I specify in the GRange. 

Would it be possible to get the reads for the segment with maybe scanBam even the ones that aren't completely inside the scanned segment. Take a look at their positions and remove/shift a few based on position. Then run coverage() based on the altered set of reads? Preferably without generating new large bam files ?

(although if this is the only way its acceptable although I'm pretty sure its not necessary)

 

Thanks.

bam rsamtools genomicranges • 1.1k views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 6 weeks ago
United States

Probably you want to input the data using GenomicAlignments::readGAlignments() (or perhaps readGAlignmentsList() or readGAlignmentPairs() for paired-end data). Use the param argument, ScanBamParam, and ScanBamParam's which argument to selectively input reads overlapping your particular regions of interest. Do your manipulations (including perhaps coercing to GRanges or GRangesList with granges() or grglist()). Then use ?"coverage,GAlignments-method" or similar to calculate coverage.

If you're working with a relatively small (e.g., 100's) of regions of interest including relatively small (e.g., millions) of reads, then probably you can read all the alignments with a single call to readGAlignments(). If you have many regions of interest or many millions of reads, then you'll likely want to iterate through the file or regions. One approach is use GenomicFiles::reduceByYield(). If you've written a function FUN that takes a GAlignments instance and returns a coverage vector (using the approach in the first paragraph), then you would write something like (psuedo code!)

bf = BamFile("your.bam", yieldSize=1000000)
reduceByYield(bf, readGAlignments, FUN)

There's a lot of scope and flexibility with this approach; feel free to ask further questions about individual steps.

ADD COMMENT

Login before adding your answer.

Traffic: 548 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6