Question: extract chromosomes and read-starts from a BAM file to a data.frame?
3.1 years ago by Vang Le (Denmark):

As the title says, I am looking for a concise, high-performance way to extract only the chromosome name and read start position from a BAM file. This is easy to do outside R:

samtools view my.bam | cut -f 3,4

but I would like to do it from within R code. Calling the command via `system` is OK.
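For the `system` route mentioned above, one possible sketch is to read the command's output through a pipe connection rather than capturing it with `system`. The helper name `read_chrom_pos` is hypothetical (it is not part of any package), and the commented-out call assumes that samtools is on the PATH and that `my.bam` exists.

```r
# Hypothetical helper: parse two tab-separated columns (chromosome, start)
# from any connection into a data.frame.
read_chrom_pos <- function(con) {
  read.table(con, sep = "\t", header = FALSE,
             col.names = c("rname", "pos"),
             colClasses = c("character", "integer"),
             quote = "", comment.char = "",
             stringsAsFactors = FALSE)
}

# Assumes samtools is installed and my.bam exists in the working directory:
# df <- read_chrom_pos(pipe("samtools view my.bam | cut -f 3,4"))
```

Setting `quote = ""` and `comment.char = ""` keeps `read.table` from treating characters in reference names as quotes or comments. Unmapped reads would appear with `*` as the chromosome, as in the SAM text format.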

 


 

Answer: extract chromosomes and read-starts from a BAM file to a data.frame?
3.1 years ago by Martin Morgan (United States):

Use Rsamtools, specifying a ScanBamParam() with just the information you'd like to extract. Coerce the result to a data.frame (it's just a list anyway, so this is inexpensive).

> library(Rsamtools)
> fl = system.file(package="Rsamtools", "extdata", "ex1.bam")
> p = ScanBamParam(what=c("rname", "pos"))
> head(as.data.frame(scanBam(fl, param=p)))
  rname pos
1  seq1   1
2  seq1   3
3  seq1   5
4  seq1   6
5  seq1   9
6  seq1  13

The help pages ?ScanBamParam and ?scanBam, the package vignette (browseVignettes("Rsamtools")), and the GenomicAlignments package are likely to be helpful.
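As a variation on the approach above, the following sketch also drops unmapped reads (which would otherwise show up with NA positions) and builds the data.frame explicitly from the single list element that scanBam() returns. It assumes Rsamtools is installed and uses the same example file; filtering on unmapped reads is an addition for illustration, not something the question asked for.

```r
library(Rsamtools)

# Example BAM file shipped with Rsamtools (same file as in the answer above).
fl <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Request only rname and pos, and skip reads flagged as unmapped.
p <- ScanBamParam(flag = scanBamFlag(isUnmappedQuery = FALSE),
                  what = c("rname", "pos"))

# Without a `which` range, scanBam() returns a list of length one.
res <- scanBam(fl, param = p)[[1]]

# rname comes back as a factor; convert it if plain strings are preferred.
df <- data.frame(rname = as.character(res$rname),
                 pos   = res$pos,
                 stringsAsFactors = FALSE)
head(df)
```

Keeping `what` down to the two fields needed means scanBam() does not parse sequences or qualities, which is where most of the time and memory would otherwise go.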
