It's been a while since I posted - hi all! I miss you bioc people here at the Hutch.
I am wondering whether a function exists that works like stackStringsFromBam, but operates on a GAlignments object (that contains a "seq" column) rather than a BAM file. I'm asking because I want to do this series of operations:
(a) read in a BAM file as GAlignments (has thousands of reads), e.g.
    aln <- readGAlignments(bamfile, use.names=TRUE, param=ScanBamParam(what="seq"))
(b) do some filtering on those alignments (now have fewer thousands of reads)
    aln_filt <- aln[myFilters]
(c) get an alignment of reads overlapping a region of interest (~10 reads in each region)
If I want to skip the filtering step, I can get the alignment like this:
    myAln <- stackStringsFromBam(bamfile, param=myRegionAsGRanges, use.names=TRUE)
but that operates on a BAM file rather than a GAlignments object.
It'd be really handy for me if there were an analogous function, something like stackStringsFromGAlignments. Would that be easy to implement? (Or does it already exist?)
I think I can see a path towards getting what I need using the sequenceLayer function but that path seems more complicated than it should be. I can see other potential paths too that will extract sequence chunks given a reference sequence, but they lose the read names, which I want to use in later processing steps.
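For reference, here is roughly the sequenceLayer path I have in mind. It is only a sketch under some assumptions: the demo ex1.bam file from Rsamtools stands in for my real BAM, the qwidth filter and the seq1:100-130 region are placeholders for my real filters and regions, and I am relying on stackStrings from Biostrings to do the padding.

```r
library(GenomicAlignments)
library(Rsamtools)

# Assumption: demo BAM shipped with Rsamtools, standing in for the real file
bamfile <- system.file("extdata", "ex1.bam", package="Rsamtools")

# (a) read the alignments, keeping read names and the "seq" column
aln <- readGAlignments(bamfile, use.names=TRUE,
                       param=ScanBamParam(what="seq"))

# (b) placeholder filter (hypothetical): keep reads of at least 30 bases
aln_filt <- aln[qwidth(aln) >= 30L]

# (c) placeholder region of interest (hypothetical coordinates)
roi <- GRanges("seq1", IRanges(100, 130))
aln_roi <- subsetByOverlaps(aln_filt, roi)

# Lay each read sequence out in reference space: insertions are dropped,
# deletions and skipped regions are filled with gap letters
laid <- sequenceLayer(mcols(aln_roi)$seq, cigar(aln_roi),
                      from="query", to="reference")

# Shift each read to its start position, then trim/pad to the region
stacked <- stackStrings(laid, start(roi), end(roi),
                        shift=start(aln_roi) - 1L,
                        Lpadding.letter="+", Rpadding.letter="+")

# Carry the read names over explicitly
names(stacked) <- names(aln_roi)
```

This gives a DNAStringSet with one equal-width string per read and the read names attached, but it handles a single region on a single chromosome at a time, which is why a ready-made function would be nicer.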
Am I missing something? It seems like there's probably a function out there that'll do what I want but I can't figure out what it is.
thanks very much!