Question: Is it possible to plot a track for each read group in a bam file with gviz without splitting the bam file into saparate bams?
gravatar for Pablo marin-garcia
4.1 years ago by
United Kingdom
Pablo marin-garcia80 wrote:

I have a Bam file with several individuals (read groups) and I would like to plot each of them in a different track with gviz. If I split the bam file and load all individual bam files I can do the task. But I wonder if is there any option to say to gviz to split the tracks by readgroup.

If I plot my multisample bam like this

bamTracks <- DataTrack(range = bamFile
                       , genome = genome
                       , type = c("l","g")
                       , name = "Coverage"
                       , chromosome = chr
                       , from = start
                       , to = end
                       , ylim = c(0,240)
                       , background.panel = "#FFFFEB"

I obtain only one track with the coverage sum of all individuals. Is there an option in the importFunction() in the DataTrack class to split by the @RG when importing the bam?

Would it be better to read a bam with rsamtools or similar module, split them and convert to granges to load in gviz?

ADD COMMENTlink modified 4.1 years ago by Martin Morgan ♦♦ 22k • written 4.1 years ago by Pablo marin-garcia80
gravatar for
4.1 years ago by
Hi Pablo, Splitting BAM files based on read groups is currently not supported by Gviz. You could write your own importFunction to do this, though. Starting from the default Gviz:::.import.bam this shouldn't be too hard. Essentially what you want to do is something like this:
myImportBam <- function (file, selection) {
    if (!file.exists(paste(file, "bai", sep = ".")) && !file.exists(paste(paste(head(strsplit("xxx.bam", ".", fixed = TRUE)[[1]], -1), collapse = "."), "bai", sep = ".")))
        stop("Unable to find index for BAM file '", file, "'. You can build an index using the following command:\n\t",
             "library(Rsamtools)\n\tindexBam(\"", file, "\")")
    sinfo <- scanBamHeader(file)[[1]]
    res <- if (!as.character(seqnames(selection)[1]) %in% names(sinfo$targets)) {
        mcols(selection) <- DataFrame(score = 0)
    } else {
        param <- ScanBamParam(what = scanBamWhat(),
                              which = selection, flag = scanBamFlag(isUnmappedQuery = FALSE))
        x <- scanBam(file, param = param)[[1]]

        ## Insert here: logic to split up x based on the read group and compute cov at fixed locations.
        ##              create GRanges object from this with ranges locations equal to the positions
        ##              where the coverage has been computed and as many additional elementMetadata vectors
        ##              as there are read groups.

        ##cov <- coverage(IRanges(x[["pos"]], width = x[["qwidth"]]))
        ##if (length(cov) == 0) {
        ##    mcols(selection) <- DataFrame(score = 0)
        ##    selection
        ##else {
        ##    GRanges(seqnames = seqnames(selection), ranges = IRanges(start = start(cov),
        ##      end = end(cov)), strand = "*", score = runValue(cov))


Please note that importing aligned reads from a BAM file as a regular DataTrack is rather crude, and you may want to investigate the more powerful AligenmentsTrack class. Florian
ADD COMMENTlink modified 4.1 years ago by Martin Morgan ♦♦ 22k • written 4.1 years ago by florian.hahne@novartis.com1.6k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 157 users visited in the last hour