How to randomly sample BAM files
@wdesouza
Brazil

Is it possible to randomly sample BAM files using the Rsamtools or GenomicAlignments packages? I would like to write code similar to what I use to sample FASTQ files (using ShortRead):

library(ShortRead)

fl <- "path_to_file.fastq"
f <- FastqSampler(fl, n=1e6)   # sampler that draws 1e6 records at random
chunk <- yield(f)              # retrieve the sampled records
close(f)

Thank you.

rsamtools genomicalignments genomicfiles
@martin-morgan-1513
United States

See GenomicFiles::reduceByYield() and REDUCEsampler(), including the example at the end of the help page:

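library(GenomicFiles)        # reduceByYield(), REDUCEsampler()
library(GenomicAlignments)   # readGAlignments()

fl <- system.file(package="Rsamtools", "extdata", "ex1.bam")
bf <- BamFile(fl, yieldSize=1000)   # read 1000 records per chunk

## Read the next chunk of alignments, keeping the qwidth and mapq fields.
yield <- function(x)
    readGAlignments(x,
        param=ScanBamParam(what=c("qwidth", "mapq")))
map <- identity

## Samples records from successive chunks of aligned reads.
reduceByYield(bf, yield, map, REDUCEsampler(1000, TRUE))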

(Normally the first argument to REDUCEsampler(), the sample size, would be much larger, e.g., 1000000.)
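To make the idea of sampling from successive chunks concrete, here is a minimal base-R sketch of reservoir sampling over chunks. It is only an illustration of the general technique, not REDUCEsampler()'s actual implementation; the function name sample_by_chunk and the toy integer data are hypothetical and stand in for chunks of aligned reads.

```r
## Hypothetical helper (not part of GenomicFiles): keep a uniform random
## sample of at most `n` elements while visiting the data chunk by chunk,
## so the full data set never has to be held in memory at once.
sample_by_chunk <- function(chunks, n) {
    reservoir <- c()
    seen <- 0
    for (chunk in chunks) {
        for (x in chunk) {
            seen <- seen + 1
            if (length(reservoir) < n) {
                ## Fill the reservoir with the first n elements.
                reservoir <- c(reservoir, x)
            } else {
                ## Replace a random slot with probability n / seen.
                j <- sample.int(seen, 1)
                if (j <= n)
                    reservoir[j] <- x
            }
        }
    }
    reservoir
}

## Toy data: 10000 integers split into 10 chunks of 1000 each.
set.seed(1)
ids <- 1:10000
chunks <- split(ids, ceiling(seq_along(ids) / 1000))
s <- sample_by_chunk(chunks, 100)
length(s)
```

The sample size stays fixed at n however many chunks are processed, which is why a small yieldSize can be combined with a large sample size without reading the whole file at once.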

 


Thank you Martin, it works.
