Extract isoform-specific ranges
2
0
Entering edit mode
hwu12 ▴ 10
@hwu12-12353
Last seen 5 months ago
United States

Hi, is there a way to identify/extract the isoform-specific ranges for multiple isoforms? For example:

GeneA.1 with 1 ranges [1,10]

GeneA.2 with two ranges [1,3][8:10]

GeneA.3 with two ranges [1,3][7:15]

I want:

The range specific for GeneA.1 is [4:6]

The range specific for GeneA.2 is none

The range specific for GeneA.3 is [11,15]

Thanks!

bioconductor genomicranges • 1.3k views
2
Entering edit mode
Malcolm Cook ★ 1.5k
@malcolm-cook-6293
Last seen 7 hours ago
United States

Hello - I expect GenomicRanges:disjoin is your friend for this operation. Look:

# editted for tightness and clarity, and to add a final split

> g<-GRangesList(GeneA.1 = GRanges(seqnames="x", ranges=IRanges(1, 10), strand="+")
+               ,GeneA.2 =  GRanges(seqnames="x", ranges=IRanges(c(1, 8), c(3,10)), strand='+')
+               ,GeneA.3 = GRanges(seqnames="x", ranges=IRanges(c(1, 7), c(3,15)), strand='+'))
> u<-unlist(g)
> d <- disjoin(u, with.revmap=TRUE)
> d$source<-lapply(d$revmap,function(i)names(u)[i])
> d1<-d[1==lengths(d$revmap)] > s<-split(d1,unlist(d1$source))
> d1
GRanges object with 2 ranges and 2 metadata columns:
seqnames    ranges strand |        revmap  source
<Rle> <IRanges>  <Rle> | <IntegerList>  <list>
[1]        x  [ 4,  6]      + |             1 GeneA.1
[2]        x  [11, 15]      + |             5 GeneA.3
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> s
GRangesList object of length 2:
$GeneA.1 GRanges object with 1 range and 2 metadata columns: seqnames ranges strand | revmap source <Rle> <IRanges> <Rle> | <IntegerList> <list> [1] x [4, 6] + | 1 GeneA.1$GeneA.3
GRanges object with 1 range and 2 metadata columns:
seqnames   ranges strand | revmap  source
[1]        x [11, 15]      + |      5 GeneA.3

-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> 

1
Entering edit mode

Could also do

subset(GRanges(coverage(u)), score==1L)

1
Entering edit mode

Yes, except that it does not provide the info present in the revmap, being the association of each range with the to which it is unique isoform which I am assuming (perhaps incorrectly?) is desired result

> subset(GRanges(coverage(u)), score==1L)
GRanges object with 2 ranges and 1 metadata column:
seqnames    ranges strand |     score
<Rle> <IRanges>  <Rle> | <integer>
[1]        x  [ 4,  6]      * |         1
[2]        x  [11, 15]      * |         1
-------
seqinfo: 1 sequence from an unspecified genome
0
Entering edit mode

Malcolm Cook, thank you so much for the complete example. Michael Lawrence, thanks for your help, too. I really appreciate it.

1
Entering edit mode

@hwu12 - Glad to help!  The best thanks would be to accept my answer!

0
Entering edit mode

Malcolm, how can I accept your answer? I want to do it but I cannot find a way to do it. It is not similar to StackOverflow. Sorry I did not see your reply until now. Please let me know and I will accept your answer. Thanks again!

0
Entering edit mode

To the left of every response to your question you should have the opportunity to click on a "thumb" to give the response +1 points and I think also a check-mark in a circle, which will turn green, which indicates that you have "accepted" the response as an answer.  I could not find documentation on this for you.  I wonder where it is.  Here is an example of a response of mine which was accepted as an answer: A: Importing a .txt file with multiple headers into R - you should be able to see the gree check mark

0
Entering edit mode

The problem is in my Safari browser. I cannot/never see the green circle there. I switched to Firefox and can see it now. Thanks again.

1
Entering edit mode

Thanks. I reported your experience as an issue here: checkbox to accept response as answer does not appear in Safari browser #439

1
Entering edit mode

Thanks! I uploaded an image to issue #439 from the two browsers to show the problem. Hopefully, they can fix it soon. In any case, I will use the Firefox to vote in the future.

0
Entering edit mode
@michael-lawrence-3846
Last seen 6 weeks ago
United States

Maybe you're looking for GenomicFeatures::mapToTranscripts()?

0
Entering edit mode

Thanks, Michael. I am actually looking for a way to isolate isoform-specific ranges (e.g. for the isoform 1 in the example, the bases from 4 to 6 are unique for the isoform 1 because it is spliced out in other 2 isoforms and does not exist in the final transcript. The [11:15] range for isoform 3 is unique, too. This range does not exist in other isoforms.).