Minimal region of overlap between two GRanges objects
1
1
Entering edit mode
Daniel Brewer ★ 1.9k
@daniel-brewer-1791
Last seen 10.4 years ago
Dear mailing list, I have two GRanges objects each containing segments of copy number change for a tumour. I would like to find regions where there is copy number change in both the tumours. I though findOverlaps was giving me what I wanted but soon found that findOverlaps(A,B) != findOverlaps(B,A). I am sure there must be a reasonable way to do this but I cannot find it. Thanks Dan -- ************************************************************** Daniel Brewer, Ph.D. Institute of Cancer Research ************************************************************** The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the a...{{dropped:5}}
Cancer Cancer • 1.5k views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 20 hours ago
United States
On 12/13/2011 07:00 AM, Daniel Brewer wrote: > Dear mailing list, > > I have two GRanges objects each containing segments of copy number > change for a tumour. I would like to find regions where there is > copy number change in both the tumours. I though findOverlaps was > giving me what I wanted but soon found that findOverlaps(A,B) != > findOverlaps(B,A). I am sure there must be a reasonable way to do > this but I cannot find it. Hi Dan -- > a = GRanges("chrA", IRanges(11, 20), seqlengths=c(chrA=100)) > b = GRanges("chrA", IRanges(16, 25), seqlengths=c(chrA=100)) > intersect(a, b) GRanges with 1 range and 0 elementMetadata values: seqnames ranges strand <rle> <iranges> <rle> [1] chrA [16, 20] * --- seqlengths: chrA 100 but maybe a working example from your end would help clarify what you're after / where things go wrong for you with findOverlaps. Martin > > Thanks > > Dan > -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793
ADD COMMENT
0
Entering edit mode
Thanks Martin, Yes you're right a simple intersect works great. I had a suspicion I was missing something very obvious. Many thanks. Dan On 13 Dec 2011, at 4:49 PM, Martin Morgan wrote: > On 12/13/2011 07:00 AM, Daniel Brewer wrote: >> Dear mailing list, >> >> I have two GRanges objects each containing segments of copy number >> change for a tumour. I would like to find regions where there is >> copy number change in both the tumours. I though findOverlaps was >> giving me what I wanted but soon found that findOverlaps(A,B) != >> findOverlaps(B,A). I am sure there must be a reasonable way to do >> this but I cannot find it. > > Hi Dan -- > >> a = GRanges("chrA", IRanges(11, 20), seqlengths=c(chrA=100)) >> b = GRanges("chrA", IRanges(16, 25), seqlengths=c(chrA=100)) >> intersect(a, b) > GRanges with 1 range and 0 elementMetadata values: > seqnames ranges strand > <rle> <iranges> <rle> > [1] chrA [16, 20] * > --- > seqlengths: > chrA > 100 > > but maybe a working example from your end would help clarify what you're > after / where things go wrong for you with findOverlaps. > > Martin > >> >> Thanks >> >> Dan >> > > > -- > Computational Biology > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 > > Location: M1-B861 > Telephone: 206 667-2793 -- ************************************************************** Daniel Brewer, Ph.D. Institute of Cancer Research Molecular Carcinogenesis MUCRC 15 Cotswold Road Sutton, Surrey SM2 5NG United Kingdom Tel: +44 (0) 20 8722 4109 ************************************************************** The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the a...{{dropped:5}}
ADD REPLY
0
Entering edit mode
Supposing I wanted to do this with several hundred tumors to look for hotspots in certain subtypes, is there a convenient way to do this internally with one or two great big GRanges? Or is this the sort of situation where I should use a genoset? Example CNV.GR is a GRanges for a tumor type, built from GISTIC CNV calls achieving regional statistical significance): > EZH2 <- getCNV('EZH2', CNV.GR) # subsetByOverlapsCNV.GR, getExons(gene, basesUp)) > head(EZH2) GRanges with 6 ranges and 1 elementMetadata value: seqnames ranges strand | score <rle> <iranges> <rle> | <numeric> 2915.Del.170 chr7 [140579242, 159118025] * | -1 2952.Del.93 chr7 [141787637, 157040032] * | -1 2929.Del.258 chr7 [141956277, 148251831] * | -1 2874.Del.44 chr7 [142484285, 159127005] * | -1 2857.Del.97 chr7 [142486080, 154885611] * | -1 2805.Del.78 chr7 [142486080, 154970916] * | -1 > sum(score(EZH2)) [1] -22 So exons of EZH2 are frequently deleted in the particular set of samples I'm looking at. Myc is amplified a great deal. etc. But I knew what I wanted to look for in those cases. Suppose I instead want to blob together and scan various tumors by common mutations, or known subtypes of driver mutations. It's handy for me to just slap a +1 or -1 in the 'score' column and add it up. It would be handier still for me to do running tallies of this... it seems like the sort of thing that people would have solved years ago, I just don't know precisely what to look for in terms of documentation for it. Any assistance is appreciated. --t On Tue, Dec 13, 2011 at 8:55 AM, Daniel Brewer <daniel.brewer@icr.ac.uk>wrote: > Thanks Martin, > > Yes you're right a simple intersect works great. I had a suspicion I was > missing something very obvious. Many thanks. > > Dan > > On 13 Dec 2011, at 4:49 PM, Martin Morgan wrote: > > > On 12/13/2011 07:00 AM, Daniel Brewer wrote: > >> Dear mailing list, > >> > >> I have two GRanges objects each containing segments of copy number > >> change for a tumour. I would like to find regions where there is > >> copy number change in both the tumours. I though findOverlaps was > >> giving me what I wanted but soon found that findOverlaps(A,B) != > >> findOverlaps(B,A). I am sure there must be a reasonable way to do > >> this but I cannot find it. > > > > Hi Dan -- > > > >> a = GRanges("chrA", IRanges(11, 20), seqlengths=c(chrA=100)) > >> b = GRanges("chrA", IRanges(16, 25), seqlengths=c(chrA=100)) > >> intersect(a, b) > > GRanges with 1 range and 0 elementMetadata values: > > seqnames ranges strand > > <rle> <iranges> <rle> > > [1] chrA [16, 20] * > > --- > > seqlengths: > > chrA > > 100 > > > > but maybe a working example from your end would help clarify what you're > > after / where things go wrong for you with findOverlaps. > > > > Martin > > > >> > >> Thanks > >> > >> Dan > >> > > > > > > -- > > Computational Biology > > Fred Hutchinson Cancer Research Center > > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 > > > > Location: M1-B861 > > Telephone: 206 667-2793 > > -- > ************************************************************** > > Daniel Brewer, Ph.D. > > Institute of Cancer Research > Molecular Carcinogenesis > MUCRC > 15 Cotswold Road > Sutton, Surrey SM2 5NG > United Kingdom > > Tel: +44 (0) 20 8722 4109 > > ************************************************************** > > > > > The Institute of Cancer Research: Royal Cancer Hospital, a charitable > Company Limited by Guarantee, Registered in England under Company No. > 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. > > This e-mail message is confidential and for use by the...{{dropped:20}}
ADD REPLY
0
Entering edit mode
Two things (1) watch out that the return object from intersect is a normalized IRanges. So adjacent Ranges will be merged. Whether or not that is a problem depends on what representation you are using. (2) Tim: you could use coverage() and slice() on an object you get by c()'ing the individual level IRanges together. Start by ignoring the score column. Kasper On Tue, Dec 13, 2011 at 1:31 PM, Tim Triche, Jr. <tim.triche at="" gmail.com=""> wrote: > Supposing I wanted to do this with several hundred tumors to look for > hotspots in certain subtypes, is there a convenient way to do this > internally with one or two great big GRanges? ?Or is this the sort of > situation where I should use a genoset? > > Example CNV.GR is a GRanges for a tumor type, built from GISTIC CNV calls > achieving regional statistical significance): > >> EZH2 <- getCNV('EZH2', CNV.GR) # subsetByOverlapsCNV.GR, getExons(gene, > basesUp)) >> head(EZH2) > GRanges with 6 ranges and 1 elementMetadata value: > ? ? ? ? ? ? ? seqnames ? ? ? ? ? ? ? ? ranges strand | ? ? score > ? ? ? ? ? ? ? ? ?<rle> ? ? ? ? ? ? ?<iranges> ?<rle> | <numeric> > ?2915.Del.170 ? ? chr7 [140579242, 159118025] ? ? ?* | ? ? ? ?-1 > ? 2952.Del.93 ? ? chr7 [141787637, 157040032] ? ? ?* | ? ? ? ?-1 > ?2929.Del.258 ? ? chr7 [141956277, 148251831] ? ? ?* | ? ? ? ?-1 > ? 2874.Del.44 ? ? chr7 [142484285, 159127005] ? ? ?* | ? ? ? ?-1 > ? 2857.Del.97 ? ? chr7 [142486080, 154885611] ? ? ?* | ? ? ? ?-1 > ? 2805.Del.78 ? ? chr7 [142486080, 154970916] ? ? ?* | ? ? ? ?-1 >> sum(score(EZH2)) > [1] -22 > > So exons of EZH2 are frequently deleted in the particular set of samples > I'm looking at. ?Myc is amplified a great deal. ?etc. But I knew what I > wanted to look for in those cases. ?Suppose I instead want to blob together > and scan various tumors by common mutations, or known subtypes of driver > mutations. ?It's handy for me to just slap a +1 or -1 in the 'score' column > and add it up. ?It would be handier still for me to do running tallies of > this... it seems like the sort of thing that people would have solved years > ago, I just don't know precisely what to look for in terms of documentation > for it. > > Any assistance is appreciated. > > --t > > > > On Tue, Dec 13, 2011 at 8:55 AM, Daniel Brewer <daniel.brewer at="" icr.ac.uk="">wrote: > >> Thanks Martin, >> >> Yes you're right a simple intersect works great. ?I had a suspicion I was >> missing something very obvious. ?Many thanks. >> >> Dan >> >> On 13 Dec 2011, at 4:49 PM, Martin Morgan wrote: >> >> > On 12/13/2011 07:00 AM, Daniel Brewer wrote: >> >> Dear mailing list, >> >> >> >> I have two GRanges objects each containing segments of copy number >> >> change for a tumour. ?I would like to find regions where there is >> >> copy number change in both the tumours. ?I though findOverlaps was >> >> giving me what I wanted but soon found that findOverlaps(A,B) != >> >> findOverlaps(B,A). ?I am sure there must be a reasonable way to do >> >> this but I cannot find it. >> > >> > Hi Dan -- >> > >> >> a = GRanges("chrA", IRanges(11, 20), seqlengths=c(chrA=100)) >> >> b = GRanges("chrA", IRanges(16, 25), seqlengths=c(chrA=100)) >> >> intersect(a, b) >> > GRanges with 1 range and 0 elementMetadata values: >> > ? ? ? seqnames ? ?ranges strand >> > ? ? ? ? ?<rle> <iranges> ?<rle> >> > ? [1] ? ? chrA ?[16, 20] ? ? ?* >> > ? --- >> > ? seqlengths: >> > ? ?chrA >> > ? ? 100 >> > >> > but maybe a working example from your end would help clarify what you're >> > after / where things go wrong for you with findOverlaps. >> > >> > Martin >> > >> >> >> >> Thanks >> >> >> >> Dan >> >> >> > >> > >> > -- >> > Computational Biology >> > Fred Hutchinson Cancer Research Center >> > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 >> > >> > Location: M1-B861 >> > Telephone: 206 667-2793 >> >> -- >> ************************************************************** >> >> Daniel Brewer, Ph.D. >> >> Institute of Cancer Research >> Molecular Carcinogenesis >> MUCRC >> 15 Cotswold Road >> Sutton, Surrey SM2 5NG >> United Kingdom >> >> Tel: +44 (0) 20 8722 4109 >> >> ************************************************************** >> >> >> >> >> The Institute of Cancer Research: Royal Cancer Hospital, a charitable >> Company Limited by Guarantee, Registered in England under Company No. >> 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. >> >> This e-mail message is confidential and for use by the...{{dropped:20}} > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY

Login before adding your answer.

Traffic: 832 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6