distanceToNearest in GenomicRanges
Tom Oates ▴ 60
@tom-oates-5703
Hi,

I am very much a learner in R in general and GenomicRanges in particular. I am struggling to find documentation to help me get my head around distanceToNearest in GenomicRanges.

If I have a GRanges object:

    GRanges with 6 ranges and 4 metadata columns:
          seqnames                 ranges strand |
             <rle>              <iranges>  <rle> |
      [1]       10 [ 96723746,  96723747]      - |
      [2]        7 [ 13641170,  13641171]      + |
      [3]       16 [ 17772801,  17772802]      - |
      [4]        3 [ 88173502,  88173503]      - |
      [5]       13 [106979682, 106979683]      + |
      [6]        9 [104393139, 104393140]      + |

(You will notice that all the regions are only dinucleotides and that I have removed the metadata.)

I have a 2nd GRanges object, which is Ensembl rat transcripts, as below:

    39549 ranges and 2 metadata columns:
          seqnames        ranges strand |     tx_id            tx_name
             <rle>     <iranges>  <rle> | <integer>        <character>
      [1]        1 [5473, 16844]      + |         1 ENSRNOT00000044270
      [2]        1 [5526, 16968]      + |         2 ENSRNOT00000049921
      [3]        1 [5526, 16968]      + |         3 ENSRNOT00000051735
      [4]        1 [5598, 13520]      + |         4 ENSRNOT00000034630
      [5]        1 [8268, 16850]      + |         5 ENSRNOT00000044505
      [6]        1 [8316, 17577]      + |         6 ENSRNOT00000042693
      [7]        1 [8884, 16850]      + |         7 ENSRNOT00000044187
      [8]        1 [8956,  9955]      + |         8 ENSRNOT00000041082
      [9]        1 [9055, 17351]      + |         9 ENSRNOT00000050254

If I invoke:

    xx <- distanceToNearest(diff.cpgs.gr, rat.transcripts, ignore.strand=F)
    xx

I get:

    DataFrame with 1133 rows and 3 columns
        queryHits subjectHits  distance
        <integer>   <integer> <integer>
    1           1        7752         0
    2           2       32166     11946
    3           3       14678     25377
    4           4       24286     66747
    5           5       10609     34242
    6           6       37076    122683
    7           7       35184         0
    8           8       34180     45561
    9           9       19351     50156
    ...       ...         ...       ...

I am uncertain how I would then use the xx output to gain information (i.e. tx_id, tx_name) about the feature which the function has identified as nearest.

I would be happy to supply any more info as required.

Tom
GenomicRanges GenomicRanges • 2.9k views
@james-w-macdonald-5106
United States
Hi Tom,

On 2/11/2013 11:35 AM, Tom Oates wrote:
> I am uncertain how I would then use the xx output to gain information (i.e.
> tx_id, tx_name) about the feature which the function has identified as
> nearest?
> I would be happy to supply any more info as required

The subjectHits column gives the row of your transcript GRanges object that matches the corresponding query row. I am assuming here that the 'diff.cpgs.gr' GRanges object is longer than 6? Anyway, here is an example using your data and the TxDb.Mmusculus.UCSC.mm10.knownGene package:

    > x
    GRanges with 6 ranges and 0 metadata columns:
          seqnames                 ranges strand
             <rle>              <iranges>  <rle>
      [1]    chr10 [ 96723746,  96723747]      *
      [2]     chr7 [ 13641170,  13641171]      *
      [3]    chr16 [ 17772801,  17772802]      *
      [4]     chr3 [ 88173502,  88173503]      *
      [5]    chr13 [106979682, 106979683]      *
      [6]     chr9 [104393139, 104393140]      *
      ---
    > y <- transcripts(TxDb.Mmusculus.UCSC.mm10.knownGene)
    > xx <- distanceToNearest(x, y, ignore.strand=F)
    > xx
    DataFrame with 6 rows and 3 columns
      queryHits subjectHits  distance
      <integer>   <integer> <integer>
    1         1        4514    100935
    2         2       45653         0
    3         3       19383         0
    4         4       34197         0
    5         5       14383         0
    6         6       54212      8108
    > y[xx[,2],]
    GRanges with 6 ranges and 2 metadata columns:
          seqnames                 ranges strand |     tx_id     tx_name
             <rle>              <iranges>  <rle> | <integer> <character>
      [1]    chr10 [ 96617001,  96622811]      + |     33419  uc007gww.2
      [2]     chr7 [ 13623967,  13670807]      + |     21400  uc012ezp.1
      [3]    chr16 [ 17759663,  17779206]      + |     48288  uc007ylz.1
      [4]     chr3 [ 88171560,  88177785]      - |     10107  uc008puf.2
      [5]    chr13 [106963757, 107022114]      - |     43288  uc007rue.1
      [6]     chr9 [104361832, 104385031]      + |     29956  uc009rhp.1
      ---
      seqlengths:
            chr1      chr2 ... chrUn_JH584304
       195471971 182113224 ...         114452

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
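For readers on a recent GenomicRanges, the same lookup can be sketched as below. This is a minimal illustration rather than the code from the thread: the `query` and `subject` ranges, the `txA`/`txB`/`txC` names, and the `annotated` object are made up for the example. It assumes a version of the package in which distanceToNearest() returns a Hits object (with the distance stored as a metadata column) rather than the DataFrame printed above, so the accessors queryHits(), subjectHits() and mcols() are used instead of column indexing.

```r
library(GenomicRanges)

# Hypothetical toy data standing in for diff.cpgs.gr and rat.transcripts
query <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
                 ranges   = IRanges(start = c(100, 500, 50), width = 2))
subject <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
                   ranges   = IRanges(start = c(90, 700, 300),
                                      end   = c(150, 800, 400)),
                   strand   = c("+", "-", "+"),
                   tx_id    = 1:3,
                   tx_name  = c("txA", "txB", "txC"))

# One hit per query range that has a nearest subject range
hits <- distanceToNearest(query, subject, ignore.strand = TRUE)

# subjectHits() indexes rows of 'subject'; queryHits() indexes rows of 'query'
nearest_tx <- subject[subjectHits(hits)]

# Carry the transcript annotation and the distance back onto the query ranges
annotated <- query[queryHits(hits)]
mcols(annotated)$tx_id    <- mcols(nearest_tx)$tx_id
mcols(annotated)$tx_name  <- mcols(nearest_tx)$tx_name
mcols(annotated)$distance <- mcols(hits)$distance

annotated
```

Subsetting the query by queryHits(hits) instead of assuming one row per query is deliberate: as far as I know, query ranges with no nearest neighbour (for example, on a sequence absent from the subject) are left out of the result, so the hits can have fewer rows than the query.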
Thanks. This works a treat and is exactly what I was looking for.

Tom

On Mon, Feb 11, 2013 at 6:08 PM, James W. MacDonald <jmacdon@uw.edu> wrote:
> The subjectHits column gives the row of your transcript GRanges object
> that matches the corresponding query row.
> [...]