Question: Calculate Outer Positions of GappedAlignmentPairs
0
gravatar for Dario Strbenac
6.6 years ago by
Dario Strbenac1.5k
Australia
Dario Strbenac1.5k wrote:
Hello, I would like to calculate the furthest outer positions of the mapped read pairs. That is, the lowest and highest mapped coordinate of a read. In a later step, I want to get a coverage estimate that also includes coverage of the spliced out intron parts. I couldn't find any function already in GenomicRanges. I made my own code to create a GRanges object of these simple ranges, but was curious about any API functionality I overlooked that could do this. Thanks, Dario. -------------------------------------- Dario Strbenac PhD Student University of Sydney Camperdown NSW 2050 Australia
coverage genomicranges • 641 views
ADD COMMENTlink modified 6.6 years ago by Michael Lawrence11k • written 6.6 years ago by Dario Strbenac1.5k
Answer: Calculate Outer Positions of GappedAlignmentPairs
0
gravatar for Michael Lawrence
6.6 years ago by
United States
Michael Lawrence11k wrote:
It would make sense for this to be granges,GappedAlignmentPairs. Want to contribute it? Michael On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac <d.strbenac@garvan.org.au>wrote: > Hello, > > I would like to calculate the furthest outer positions of the mapped read > pairs. That is, the lowest and highest mapped coordinate of a read. In a > later step, I want to get a coverage estimate that also includes coverage > of the spliced out intron parts. I couldn't find any function already in > GenomicRanges. I made my own code to create a GRanges object of these > simple ranges, but was curious about any API functionality I overlooked > that could do this. > > Thanks, > Dario. > > -------------------------------------- > Dario Strbenac > PhD Student > University of Sydney > Camperdown NSW 2050 > Australia > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD COMMENTlink written 6.6 years ago by Michael Lawrence11k
Hello, There are a couple of existing methods of GappedAlignmentPairs that may be helpful. Using 'galp' from the man page as an example, example(GappedAlignmentPairs) > class(galp) [1] "GappedAlignmentPairs" attr(,"package") [1] "GenomicRanges" > length(galp) [1] 1572 To get your furthest outer positions, pull out the start of the left-most pair and the end of the right-most pair: outer <- cbind(start(left(galp)), end(right(galp))) > head(outer) [,1] [,2] [1,] 36 219 [2,] 49 259 [3,] 51 262 [4,] 60 259 [5,] 60 269 [6,] 61 282 For coverage, the left and right pair can be extracted and coerced to a GRAnges which retains introns in each pair as part of the range. grleft <- granges(left(galp)) grright <- granges(right(galp)) If we implemented a granges,GappedAlignmentPairs it should return something of the same length. To do this we'd have to merge the pair ranges which would then include the space between the ranges which I don't think we want. Right? As an fyi, another potentially useful function is introns(). This returns a GRangesList of all intron ranges for each pair. introns <- introns(galp) Valerie On 03/19/2013 09:44 PM, Michael Lawrence wrote: > It would make sense for this to be granges,GappedAlignmentPairs. Want to > contribute it? > > Michael > > > On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac <d.strbenac at="" garvan.org.au="">wrote: > >> Hello, >> >> I would like to calculate the furthest outer positions of the mapped read >> pairs. That is, the lowest and highest mapped coordinate of a read. In a >> later step, I want to get a coverage estimate that also includes coverage >> of the spliced out intron parts. I couldn't find any function already in >> GenomicRanges. I made my own code to create a GRanges object of these >> simple ranges, but was curious about any API functionality I overlooked >> that could do this. >> >> Thanks, >> Dario. >> >> -------------------------------------- >> Dario Strbenac >> PhD Student >> University of Sydney >> Camperdown NSW 2050 >> Australia >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLYlink written 6.6 years ago by Valerie Obenchain6.7k
I think that Dario was in fact aiming for a single range that spanned the entire alignment, including the gap between the ends, so a granges() method would make sense here. Looks like you're most of the way towards implementing it ;) Michael On Wed, Mar 20, 2013 at 10:05 AM, Valerie Obenchain <vobencha@fhcrc.org>wrote: > Hello, > > There are a couple of existing methods of GappedAlignmentPairs that may be > helpful. > > Using 'galp' from the man page as an example, > > example(GappedAlignmentPairs) > > class(galp) > [1] "GappedAlignmentPairs" > attr(,"package") > [1] "GenomicRanges" > > length(galp) > [1] 1572 > > To get your furthest outer positions, pull out the start of the left-most > pair and the end of the right-most pair: > > outer <- cbind(start(left(galp)), end(right(galp))) > > head(outer) > [,1] [,2] > [1,] 36 219 > [2,] 49 259 > [3,] 51 262 > [4,] 60 259 > [5,] 60 269 > [6,] 61 282 > > For coverage, the left and right pair can be extracted and coerced to a > GRAnges which retains introns in each pair as part of the range. > > grleft <- granges(left(galp)) > grright <- granges(right(galp)) > > If we implemented a granges,GappedAlignmentPairs it should return > something of the same length. To do this we'd have to merge the pair ranges > which would then include the space between the ranges which I don't think > we want. Right? > > As an fyi, another potentially useful function is introns(). This returns > a GRangesList of all intron ranges for each pair. > > introns <- introns(galp) > > > Valerie > > > On 03/19/2013 09:44 PM, Michael Lawrence wrote: > >> It would make sense for this to be granges,GappedAlignmentPairs. Want to >> contribute it? >> >> Michael >> >> >> On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac <d.strbenac@garvan.org.au>> >**wrote: >> >> Hello, >>> >>> I would like to calculate the furthest outer positions of the mapped read >>> pairs. That is, the lowest and highest mapped coordinate of a read. In a >>> later step, I want to get a coverage estimate that also includes coverage >>> of the spliced out intron parts. I couldn't find any function already in >>> GenomicRanges. I made my own code to create a GRanges object of these >>> simple ranges, but was curious about any API functionality I overlooked >>> that could do this. >>> >>> Thanks, >>> Dario. >>> >>> ------------------------------**-------- >>> Dario Strbenac >>> PhD Student >>> University of Sydney >>> Camperdown NSW 2050 >>> Australia >>> ______________________________**_________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org >>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.="" ethz.ch="" mailman="" listinfo="" bioconductor=""> >>> Search the archives: >>> http://news.gmane.org/gmane.**science.biology.informatics.**conduc tor<http: news.gmane.org="" gmane.science.biology.informatics.conductor=""> >>> >>> >> [[alternative HTML version deleted]] >> >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> >> [[alternative HTML version deleted]]
ADD REPLYlink written 6.6 years ago by Michael Lawrence11k
Hi, Right, forgot to provide a granges() method for GappedAlignmentPairs. Makes sense to have one. Will do something like: rg <- range(grglist(x)) if (!all(elementLengths(rg) == 1L)) stop("For some pairs in 'x', the first and last alignments ", "are not aligned to the same chromosome and strand. " "Cannot extract a single range for them.") ans <- unlist(rg) ans Will add this to GenomicRanges ASAP. Note that, except for the error message (which is specific to paired- end reads), the code above implements a kind of standard way to go from a GRangesList object ('rg' in the code above) to a GRanges object ('ans') by merging all the ranges within each top-level element of the GRangesList into a single one. There is actually the same relationship between the GRangesList and GRanges objects obtained with grglist(x) and granges(x) on a GappedAlignments object. And we should have this relationship again when 'x' is a GAlignmentsList. So I wonder if we shouldn't make this transformation available as the coercion method from GRangesList to GRanges (there is no such coercion at the moment), and, in the man pages for grglist() and granges(), explain that granges(x) is conceptually equivalent to as(grglist(x), "GRanges"). Cheers, H. On 03/20/2013 10:52 AM, Michael Lawrence wrote: > I think that Dario was in fact aiming for a single range that spanned the > entire alignment, including the gap between the ends, so a granges() method > would make sense here. Looks like you're most of the way towards > implementing it ;) > > Michael > > > > On Wed, Mar 20, 2013 at 10:05 AM, Valerie Obenchain <vobencha at="" fhcrc.org="">wrote: > >> Hello, >> >> There are a couple of existing methods of GappedAlignmentPairs that may be >> helpful. >> >> Using 'galp' from the man page as an example, >> >> example(GappedAlignmentPairs) >>> class(galp) >> [1] "GappedAlignmentPairs" >> attr(,"package") >> [1] "GenomicRanges" >>> length(galp) >> [1] 1572 >> >> To get your furthest outer positions, pull out the start of the left-most >> pair and the end of the right-most pair: >> >> outer <- cbind(start(left(galp)), end(right(galp))) >>> head(outer) >> [,1] [,2] >> [1,] 36 219 >> [2,] 49 259 >> [3,] 51 262 >> [4,] 60 259 >> [5,] 60 269 >> [6,] 61 282 >> >> For coverage, the left and right pair can be extracted and coerced to a >> GRAnges which retains introns in each pair as part of the range. >> >> grleft <- granges(left(galp)) >> grright <- granges(right(galp)) >> >> If we implemented a granges,GappedAlignmentPairs it should return >> something of the same length. To do this we'd have to merge the pair ranges >> which would then include the space between the ranges which I don't think >> we want. Right? >> >> As an fyi, another potentially useful function is introns(). This returns >> a GRangesList of all intron ranges for each pair. >> >> introns <- introns(galp) >> >> >> Valerie >> >> >> On 03/19/2013 09:44 PM, Michael Lawrence wrote: >> >>> It would make sense for this to be granges,GappedAlignmentPairs. Want to >>> contribute it? >>> >>> Michael >>> >>> >>> On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac <d.strbenac at="" garvan.org.au="">>>> **wrote: >>> >>> Hello, >>>> >>>> I would like to calculate the furthest outer positions of the mapped read >>>> pairs. That is, the lowest and highest mapped coordinate of a read. In a >>>> later step, I want to get a coverage estimate that also includes coverage >>>> of the spliced out intron parts. I couldn't find any function already in >>>> GenomicRanges. I made my own code to create a GRanges object of these >>>> simple ranges, but was curious about any API functionality I overlooked >>>> that could do this. >>>> >>>> Thanks, >>>> Dario. >>>> >>>> ------------------------------**-------- >>>> Dario Strbenac >>>> PhD Student >>>> University of Sydney >>>> Camperdown NSW 2050 >>>> Australia >>>> ______________________________**_________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat="" .ethz.ch="" mailman="" listinfo="" bioconductor=""> >>>> Search the archives: >>>> http://news.gmane.org/gmane.**science.biology.informatics.**condu ctor<http: news.gmane.org="" gmane.science.biology.informatics.conductor=""> >>>> >>>> >>> [[alternative HTML version deleted]] >>> >>> >>> ______________________________**_________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.="" ethz.ch="" mailman="" listinfo="" bioconductor=""> >>> Search the archives: http://news.gmane.org/gmane.** >>> science.biology.informatics.**conductor<http: news.gmane.org="" gman="" e.science.biology.informatics.conductor=""> >>> >>> > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > -- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
ADD REPLYlink written 6.6 years ago by Hervé Pagès ♦♦ 14k
On 03/20/2013 11:32 AM, Hervé Pagès wrote: > Hi, > > Right, forgot to provide a granges() method for GappedAlignmentPairs. > Makes sense to have one. Will do something like: > > rg <- range(grglist(x)) > if (!all(elementLengths(rg) == 1L)) > stop("For some pairs in 'x', the first and last alignments ", > "are not aligned to the same chromosome and strand. " > "Cannot extract a single range for them.") > ans <- unlist(rg) > ans > > Will add this to GenomicRanges ASAP. Added to GenomicRanges 1.11.39 (devel). Cheers, H. > > Note that, except for the error message (which is specific to paired-end > reads), the code above implements a kind of standard way to go from a > GRangesList object ('rg' in the code above) to a GRanges object ('ans') > by merging all the ranges within each top-level element of the > GRangesList into a single one. There is actually the same relationship > between the GRangesList and GRanges objects obtained with grglist(x) > and granges(x) on a GappedAlignments object. And we should have this > relationship again when 'x' is a GAlignmentsList. So I wonder if we > shouldn't make this transformation available as the coercion method > from GRangesList to GRanges (there is no such coercion at the moment), > and, in the man pages for grglist() and granges(), explain that > granges(x) is conceptually equivalent to as(grglist(x), "GRanges"). > > Cheers, > H. > > > On 03/20/2013 10:52 AM, Michael Lawrence wrote: >> I think that Dario was in fact aiming for a single range that spanned the >> entire alignment, including the gap between the ends, so a granges() >> method >> would make sense here. Looks like you're most of the way towards >> implementing it ;) >> >> Michael >> >> >> >> On Wed, Mar 20, 2013 at 10:05 AM, Valerie Obenchain >> <vobencha at="" fhcrc.org="">wrote: >> >>> Hello, >>> >>> There are a couple of existing methods of GappedAlignmentPairs that >>> may be >>> helpful. >>> >>> Using 'galp' from the man page as an example, >>> >>> example(GappedAlignmentPairs) >>>> class(galp) >>> [1] "GappedAlignmentPairs" >>> attr(,"package") >>> [1] "GenomicRanges" >>>> length(galp) >>> [1] 1572 >>> >>> To get your furthest outer positions, pull out the start of the >>> left-most >>> pair and the end of the right-most pair: >>> >>> outer <- cbind(start(left(galp)), end(right(galp))) >>>> head(outer) >>> [,1] [,2] >>> [1,] 36 219 >>> [2,] 49 259 >>> [3,] 51 262 >>> [4,] 60 259 >>> [5,] 60 269 >>> [6,] 61 282 >>> >>> For coverage, the left and right pair can be extracted and coerced to a >>> GRAnges which retains introns in each pair as part of the range. >>> >>> grleft <- granges(left(galp)) >>> grright <- granges(right(galp)) >>> >>> If we implemented a granges,GappedAlignmentPairs it should return >>> something of the same length. To do this we'd have to merge the pair >>> ranges >>> which would then include the space between the ranges which I don't >>> think >>> we want. Right? >>> >>> As an fyi, another potentially useful function is introns(). This >>> returns >>> a GRangesList of all intron ranges for each pair. >>> >>> introns <- introns(galp) >>> >>> >>> Valerie >>> >>> >>> On 03/19/2013 09:44 PM, Michael Lawrence wrote: >>> >>>> It would make sense for this to be granges,GappedAlignmentPairs. >>>> Want to >>>> contribute it? >>>> >>>> Michael >>>> >>>> >>>> On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac >>>> <d.strbenac at="" garvan.org.au="">>>>> **wrote: >>>> >>>> Hello, >>>>> >>>>> I would like to calculate the furthest outer positions of the >>>>> mapped read >>>>> pairs. That is, the lowest and highest mapped coordinate of a read. >>>>> In a >>>>> later step, I want to get a coverage estimate that also includes >>>>> coverage >>>>> of the spliced out intron parts. I couldn't find any function >>>>> already in >>>>> GenomicRanges. I made my own code to create a GRanges object of these >>>>> simple ranges, but was curious about any API functionality I >>>>> overlooked >>>>> that could do this. >>>>> >>>>> Thanks, >>>>> Dario. >>>>> >>>>> ------------------------------**-------- >>>>> Dario Strbenac >>>>> PhD Student >>>>> University of Sydney >>>>> Camperdown NSW 2050 >>>>> Australia >>>>> ______________________________**_________________ >>>>> Bioconductor mailing list >>>>> Bioconductor at r-project.org >>>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: sta="" t.ethz.ch="" mailman="" listinfo="" bioconductor=""> >>>>> >>>>> Search the archives: >>>>> http://news.gmane.org/gmane.**science.biology.informatics.**cond uctor<http: news.gmane.org="" gmane.science.biology.informatics.conducto="" r=""> >>>>> >>>>> >>>>> >>>> [[alternative HTML version deleted]] >>>> >>>> >>>> ______________________________**_________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat="" .ethz.ch="" mailman="" listinfo="" bioconductor=""> >>>> >>>> Search the archives: http://news.gmane.org/gmane.** >>>> science.biology.informatics.**conductor<http: news.gmane.org="" gma="" ne.science.biology.informatics.conductor=""> >>>> >>>> >>>> >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > -- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
ADD REPLYlink written 6.6 years ago by Hervé Pagès ♦♦ 14k
On 03/20/2013 10:05 AM, Valerie Obenchain wrote: > Hello, > > There are a couple of existing methods of GappedAlignmentPairs that may > be helpful. > > Using 'galp' from the man page as an example, > > example(GappedAlignmentPairs) > > class(galp) > [1] "GappedAlignmentPairs" > attr(,"package") > [1] "GenomicRanges" > > length(galp) > [1] 1572 > > To get your furthest outer positions, pull out the start of the > left-most pair and the end of the right-most pair: > > outer <- cbind(start(left(galp)), end(right(galp))) > > head(outer) > [,1] [,2] > [1,] 36 219 > [2,] 49 259 > [3,] 51 262 > [4,] 60 259 > [5,] 60 269 > [6,] 61 282 > > For coverage, the left and right pair can be extracted and coerced to a > GRAnges which retains introns in each pair as part of the range. > > grleft <- granges(left(galp)) > grright <- granges(right(galp)) > > If we implemented a granges,GappedAlignmentPairs it should return > something of the same length. To do this we'd have to merge the pair > ranges which would then include the space between the ranges which I > don't think we want. Right? This is a good point. The new "granges" method for GappedAlignmentPairs doesn't make any distinction between gaps (i.e. N's in the CIGAR's) and the space between the first and last alignments of a pair. All those gaps/spaces are "filled" to produce a single range per pair. If this is what the user wants to do, then fine, now s/he has an easy way to do it (with granges()). Whether it makes sense or not to call coverage() on that thing is another story and entirely up to the user. However note that this is NOT what Samtool's pileup (or whatever it's called those days) does. Last time I checked (was a long time ago, could have changed), you would need to do coverage(grglist(x)) on GappedAlignments object 'x' to obtain something that is consistent with Samtool's pileup. That means that gaps (i.e. N's in the CIGAR's) do NOT generate coverage (but D's do!). I'm pretty confident that the space between the first and last alignments of a pair doesn't generate coverage either. But that doesn't mean there are no situations where doing coverage(granges(x)) on GappedAlignmentPairs object 'x' doesn't make sense. Maybe there are. It's just not the usual way of doing things and one needs to be aware of it and the implications that this will have on the downstream analysis. Cheers, H. > > As an fyi, another potentially useful function is introns(). This > returns a GRangesList of all intron ranges for each pair. > > introns <- introns(galp) > > > Valerie > > On 03/19/2013 09:44 PM, Michael Lawrence wrote: >> It would make sense for this to be granges,GappedAlignmentPairs. Want to >> contribute it? >> >> Michael >> >> >> On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac >> <d.strbenac at="" garvan.org.au="">wrote: >> >>> Hello, >>> >>> I would like to calculate the furthest outer positions of the mapped >>> read >>> pairs. That is, the lowest and highest mapped coordinate of a read. In a >>> later step, I want to get a coverage estimate that also includes >>> coverage >>> of the spliced out intron parts. I couldn't find any function already in >>> GenomicRanges. I made my own code to create a GRanges object of these >>> simple ranges, but was curious about any API functionality I overlooked >>> that could do this. >>> >>> Thanks, >>> Dario. >>> >>> -------------------------------------- >>> Dario Strbenac >>> PhD Student >>> University of Sydney >>> Camperdown NSW 2050 >>> Australia >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor -- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
ADD REPLYlink written 6.6 years ago by Hervé Pagès ♦♦ 14k
I've heard of algorithms that make use of the "read bounds" coverage, so this should be useful for some. Thanks a lot for adding it so quickly! Michael On Wed, Mar 20, 2013 at 12:34 PM, Hervé Pagès <hpages@fhcrc.org> wrote: > > > On 03/20/2013 10:05 AM, Valerie Obenchain wrote: > >> Hello, >> >> There are a couple of existing methods of GappedAlignmentPairs that may >> be helpful. >> >> Using 'galp' from the man page as an example, >> >> example(GappedAlignmentPairs) >> > class(galp) >> [1] "GappedAlignmentPairs" >> attr(,"package") >> [1] "GenomicRanges" >> > length(galp) >> [1] 1572 >> >> To get your furthest outer positions, pull out the start of the >> left-most pair and the end of the right-most pair: >> >> outer <- cbind(start(left(galp)), end(right(galp))) >> > head(outer) >> [,1] [,2] >> [1,] 36 219 >> [2,] 49 259 >> [3,] 51 262 >> [4,] 60 259 >> [5,] 60 269 >> [6,] 61 282 >> >> For coverage, the left and right pair can be extracted and coerced to a >> GRAnges which retains introns in each pair as part of the range. >> >> grleft <- granges(left(galp)) >> grright <- granges(right(galp)) >> >> If we implemented a granges,GappedAlignmentPairs it should return >> something of the same length. To do this we'd have to merge the pair >> ranges which would then include the space between the ranges which I >> don't think we want. Right? >> > > This is a good point. The new "granges" method for GappedAlignmentPairs > doesn't make any distinction between gaps (i.e. N's in the CIGAR's) and > the space between the first and last alignments of a pair. All those > gaps/spaces are "filled" to produce a single range per pair. > > If this is what the user wants to do, then fine, now s/he has an easy > way to do it (with granges()). Whether it makes sense or not to call > coverage() on that thing is another story and entirely up to the user. > > However note that this is NOT what Samtool's pileup (or whatever it's > called those days) does. Last time I checked (was a long time ago, > could have changed), you would need to do coverage(grglist(x)) on > GappedAlignments object 'x' to obtain something that is consistent > with Samtool's pileup. That means that gaps (i.e. N's in the CIGAR's) > do NOT generate coverage (but D's do!). I'm pretty confident that > the space between the first and last alignments of a pair doesn't > generate coverage either. > > But that doesn't mean there are no situations where doing > coverage(granges(x)) on GappedAlignmentPairs object 'x' doesn't make > sense. Maybe there are. It's just not the usual way of doing things > and one needs to be aware of it and the implications that this will > have on the downstream analysis. > > Cheers, > H. > > > >> As an fyi, another potentially useful function is introns(). This >> returns a GRangesList of all intron ranges for each pair. >> >> introns <- introns(galp) >> >> >> Valerie >> >> On 03/19/2013 09:44 PM, Michael Lawrence wrote: >> >>> It would make sense for this to be granges,GappedAlignmentPairs. Want to >>> contribute it? >>> >>> Michael >>> >>> >>> On Tue, Mar 19, 2013 at 5:59 PM, Dario Strbenac >>> <d.strbenac@garvan.org.au>**wrote: >>> >>> Hello, >>>> >>>> I would like to calculate the furthest outer positions of the mapped >>>> read >>>> pairs. That is, the lowest and highest mapped coordinate of a read. In a >>>> later step, I want to get a coverage estimate that also includes >>>> coverage >>>> of the spliced out intron parts. I couldn't find any function already in >>>> GenomicRanges. I made my own code to create a GRanges object of these >>>> simple ranges, but was curious about any API functionality I overlooked >>>> that could do this. >>>> >>>> Thanks, >>>> Dario. >>>> >>>> ------------------------------**-------- >>>> Dario Strbenac >>>> PhD Student >>>> University of Sydney >>>> Camperdown NSW 2050 >>>> Australia >>>> ______________________________**_________________ >>>> Bioconductor mailing list >>>> Bioconductor@r-project.org >>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat="" .ethz.ch="" mailman="" listinfo="" bioconductor=""> >>>> Search the archives: >>>> http://news.gmane.org/gmane.**science.biology.informatics.**condu ctor<http: news.gmane.org="" gmane.science.biology.informatics.conductor=""> >>>> >>>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________**_________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org >>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.="" ethz.ch="" mailman="" listinfo="" bioconductor=""> >>> Search the archives: >>> http://news.gmane.org/gmane.**science.biology.informatics.**conduc tor<http: news.gmane.org="" gmane.science.biology.informatics.conductor=""> >>> >>> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: >> http://news.gmane.org/gmane.**science.biology.informatics.**conduct or<http: news.gmane.org="" gmane.science.biology.informatics.conductor=""> >> > > -- > Hervé Pagès > > Program in Computational Biology > Division of Public Health Sciences > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, M1-B514 > P.O. Box 19024 > Seattle, WA 98109-1024 > > E-mail: hpages@fhcrc.org > Phone: (206) 667-5791 > Fax: (206) 667-1319 > [[alternative HTML version deleted]]
ADD REPLYlink written 6.6 years ago by Michael Lawrence11k
Thanks for the addition ! I was about to take up the challenge to create and submit an implementation.
ADD REPLYlink written 6.6 years ago by Dario Strbenac1.5k
Counting the space between the two reads is certainly used in some algorithms. But we should have the capability to not do so. Also, in both cases, we should be able to not double count if a base is covered by both reads, in case they overlap. Kasper On Wed, Mar 20, 2013 at 7:00 PM, Dario Strbenac <d.strbenac at="" garvan.org.au=""> wrote: > Thanks for the addition ! I was about to take up the challenge to create and submit an implementation. > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLYlink written 6.6 years ago by Kasper Daniel Hansen6.4k
On Wed, Mar 20, 2013 at 6:57 PM, Kasper Daniel Hansen < kasperdanielhansen@gmail.com> wrote: > Counting the space between the two reads is certainly used in some > algorithms. > > But we should have the capability to not do so. > > This is already supported to some extent. grglist() will exclude the space and any N gaps, optionally D. > Also, in both cases, we should be able to not double count if a base > is covered by both reads, in case they overlap. > > This comes about by calling reduce() on grglist(). > Kasper > > On Wed, Mar 20, 2013 at 7:00 PM, Dario Strbenac > <d.strbenac@garvan.org.au> wrote: > > Thanks for the addition ! I was about to take up the challenge to create > and submit an implementation. > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor@r-project.org > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD REPLYlink written 6.6 years ago by Michael Lawrence11k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 314 users visited in the last hour