Gviz: Filtering AlignmentsTrack by read names
1
0
Entering edit mode
anikajohn • 0
@e0dc22d0
Last seen 8 days ago
Switzerland

Hi,

I am new to Gviz and it's lots of fun. Right now I am looking into visualising read alignments. Currently, using the code below, all reads of the bam file are displayed. However, I'd like to only display the alignment of one specific read (it's supposed to become a functionality within an app). Is there a way to filter specific reads by their id when using AlignmentsTrack. Or alternatively, is there a possibility to use GAlignments object with AlignmentsTrack (I can filter these easily).

alTrack <- AlignmentsTrack(alignment.bam, showIndels=TRUE, name="Indels")

plotTracks(alTrack,
from = 1,
to = 12000,
chromosome = "ABC")


Gvi Gviz • 148 views
0
Entering edit mode

I have some better reproducible example here:

library(Gviz)
library(GenomicAlignments)

###### How it works right now #######

afrom <- 2960000
ato <- 3160000
alTrack <- AlignmentsTrack(
system.file(package = "Gviz", "extdata", "gapped.bam"),
isPaired = TRUE)

plotTracks(alTrack,from = afrom, to = ato,
chromosome = "chr12")

###### How I want it to work #######

#1. Filter the bam file, so that left with only one read
bamFile <- system.file(package = "Gviz", "extdata", "gapped.bam")
#give reads an example name, want to filter by it later

#2. display the alignment of only the one selected read (PSEUDO CODE)
alTrack <- AlignmentsTrack(
bamObject,
isPaired = TRUE)

plotTracks(alTrack,from = afrom, to = ato,
chromosome = "chr12")


Maybe that makes more obvious what I mean.

-Cheers

0
Entering edit mode
@james-w-macdonald-5106
Last seen 1 day ago
United States

This is explained in the help page for AlignmentsTracks. As per most R help, everything you need is there, but it was written by someone who knew exactly how the machine works, which is not always the best situation. Here's the start of the relevant section:

 range: An optional meta argument to handle the different input
types. If the 'range' argument is missing, all the relevant
information to create the object has to be provided as
individual function arguments (see below).

The different input options for 'range' are:

A 'character' string: the path to a 'BAM' file containing the read
alignments. To be precise, this will result in the
instantiation of a 'ReferenceAlignmentsTrack' object, but
for the user this implementation detail should be of no
concern.

A 'GRanges' object: the genomic ranges of the individual reads as well
'cigar', 'mapq', 'flag', 'isize', 'groupid', 'status',
'md' and 'seqs' (see description of the individual
function parameters below for details). Calling the
constructor on a 'GRanges' object without further
arguments, e.g. 'AlignmentsTrack(range=obj)' is
equivalent to calling the coerce method 'as(obj,
"AlignmentsTrack")'.


Where the super-relevant part is the explanation of the GRanges part. But how would one get such a thing? You already tried the obvious. But as you can see that won't work because it doesn't return a GRanges object. But you can read in everything you need using scanBam and make the GRanges yourself.

> bfl <- BamFile(bamFile, asMates = TRUE)

## scanBam returns a list for each range in 'which'. We only used one range, so just get that list item
> z <- scanBam(bfl, param = ScanBamParam(which = GRanges("chr12:2960000-3160000"), what = scanBamWhat(), flag = scanBamFlag(isUnmappedQuery = FALSE)))[[1]]

## Now turn that into a GRanges object
> gr <- GRanges(z$rname, IRanges(z$pos, width = z$qwidth), z$strand,
id = z$qname, cigar = z$cigar, mapq = z$mapq, flag = z$flag, isize = z$isize, groupid = z$groupid, status = z$mate_status) > gr GRanges object with 2244 ranges and 7 metadata columns: seqnames ranges strand | id cigar <Rle> <IRanges> <Rle> | <character> <character> [1] chr12 2968078-2968127 + | HWUSI-EAS1794_0001_F.. 50M [2] chr12 2968219-2968268 - | HWUSI-EAS1794_0001_F.. 50M [3] chr12 2986702-2986751 + | HWUSI-EAS1794_0001_F.. 50M [4] chr12 2986841-2986890 - | HWUSI-EAS1794_0001_F.. 50M [5] chr12 2967685-2967734 + | HWUSI-EAS1794_0001_F.. 50M ... ... ... ... . ... ... [2240] chr12 3152224-3152273 - | HWUSI-EAS1794_0001_F.. 50M [2241] chr12 2997921-2997970 - | HWUSI-EAS1794_0001_F.. 50M [2242] chr12 2973632-2973681 + | HWUSI-EAS1794_0001_F.. 30M187N20M [2243] chr12 2981406-2981455 + | HWUSI-EAS1794_0001_F.. 8M1729N42M [2244] chr12 2975582-2975631 + | HWUSI-EAS1794_0001_F.. 50M mapq flag isize groupid status <integer> <integer> <integer> <integer> <factor> [1] 255 99 0 1 mated [2] 255 147 0 1 mated [3] 255 99 0 2 mated [4] 255 147 0 2 mated [5] 255 99 0 3 mated ... ... ... ... ... ... [2240] 255 113 0 1166 unmated [2241] 255 177 0 1167 unmated [2242] 255 137 <NA> 1168 unmated [2243] 255 73 <NA> 1169 unmated [2244] 255 137 <NA> 1170 unmated ------- seqinfo: 1 sequence from an unspecified genome; no seqlengths  Now to plot only one read, you need to know which read you want to plot. I have no idea how you would know that, but here's one random read. alTrack2 <- AlignmentsTrack(gr[gr$id %in% gr\$id[400]])

## and now you can plot that one read by doing

plotTracks(alTrack2, from = afrom, to = ato, chromosome = "chr12")