How to get closest gene around chromosome location? [biomaRt]
Guido Hooiveld ★ 3.9k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …
Dear list,

Does anyone have a pointer or a piece of code for finding the genes that are closest to a specific chromosomal location?

I am using Biostrings and the associated mouse genome package to find and localize specific sequences (putative TFBS). This works fine (thanks Herve for the clear vignette). I now have an output (list) like this:

seqname  start     end       strand  patternID
chr1     7196884   7196895   +       GH_TEST
chr1     15433465  15433476  +       GH_TEST
chr1     78251474  78251485  +       GH_TEST
chr1     82635484  82635495  +       GH_TEST
chr1     22603411  22603422  -       GH_TEST
chr1     34167820  34167831  -       GH_TEST
chr1     47227452  47227463  -       GH_TEST

Next, I would like to know which gene is closest to each entry, and what the distance is. For example: the first entry is situated 1234 bp upstream of the TSS of gene ENSMUSG000xx, entry 2 is located 3456 bp downstream of the TSS of gene ENSMUSG00yy, etc.

I am thinking of using biomaRt for this, but I don't know how to do it.

Note: in the archive I did find a related thread, but the functions given there unfortunately do not give the information I am after.
http://article.gmane.org/gmane.science.biology.informatics.conductor/5097

Any suggestions are appreciated,
Guido

------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
internet: http://nutrigene.4t.com
email: guido.hooiveld@wur.nl
Biostrings biomaRt • 2.4k views
@sean-davis-490
United States
On Mon, Oct 27, 2008 at 12:45 PM, Hooiveld, Guido <guido.hooiveld at wur.nl> wrote:
> Anyone a pointer / piece of code on how to find genes that are closest
> to a specific chromosomal location?
> [...]

The findClosestGene() function in the ACME package will do this using RefSeq data from UCSC.

Sean
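For readers who want to see the calculation the question describes (nearest TSS, distance, and upstream/downstream), here is a minimal base-R sketch. It is not how findClosestGene() is implemented and it does not call ACME or biomaRt. The gene table, its gene IDs, and its coordinates are invented for illustration, closest_tss() is a hypothetical helper, and the hit position is taken to be the start coordinate, which is an assumption.

```r
# Two hits taken from the question's table (start/end on chr1).
hits <- data.frame(
  seqname = c("chr1", "chr1"),
  start   = c(7196884, 22603411),
  end     = c(7196895, 22603422),
  stringsAsFactors = FALSE
)

# Invented TSS table: gene names, positions and strands are NOT real annotation.
genes <- data.frame(
  gene    = c("geneA", "geneB", "geneC"),
  seqname = c("chr1", "chr1", "chr1"),
  tss     = c(7200000, 22600000, 50000000),
  strand  = c("+", "-", "+"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each hit, find the gene with the nearest TSS on the
# same chromosome and report the distance and the side relative to that gene.
closest_tss <- function(hits, genes) {
  rows <- lapply(seq_len(nrow(hits)), function(i) {
    g <- genes[genes$seqname == hits$seqname[i], ]
    if (nrow(g) == 0) {
      return(data.frame(gene = NA_character_, distance = NA_real_,
                        side = NA_character_, stringsAsFactors = FALSE))
    }
    offset <- hits$start[i] - g$tss              # offset in genomic coordinates
    # Flip the sign for minus-strand genes so that negative means upstream
    # in the direction of transcription.
    signed <- ifelse(g$strand == "+", offset, -offset)
    k <- which.min(abs(signed))                  # nearest TSS
    data.frame(
      gene     = g$gene[k],
      distance = abs(signed[k]),
      side     = if (signed[k] < 0) "upstream" else "downstream",
      stringsAsFactors = FALSE
    )
  })
  cbind(hits, do.call(rbind, rows))
}

result <- closest_tss(hits, genes)
print(result)
```

In practice the invented genes table would be replaced by real annotation, for example the RefSeq data that ACME retrieves from UCSC or a TSS table pulled through biomaRt.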
Thanks Sean for this pointer; I did not know of this function and I'll have a look.

Guido
Hi Sean,

Concerning the getRefflat function, I can't seem to get it to work for either Bos taurus or human. Is it because I am running R version 2.6.0?

    > rf <- getRefflat('bosTau4')
    trying URL 'http://hgdownload.cse.ucsc.edu/goldenPath/bosTau4/database/refFlat.txt.gz'
    Content type 'application/x-gzip' length 906439 bytes (885 Kb)
    opened URL
    downloaded 885 Kb

    Error in pushBack(c(lines, lines), file) :
      can only push back on text-mode connections

    > sessionInfo()
    R version 2.6.0 (2007-10-03)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] ACME_1.8.0

Best regards,
João Fadista
On Tue, Oct 28, 2008 at 4:57 AM, João Fadista <joao.fadista at agrsci.dk> wrote:
> Concerning the getRefflat function, I can't seem to get it to work with
> bos taurus nor human. Is it because I am having R version 2.6.0?
>
>> rf <- getRefflat('bosTau4')
> Error in pushBack(c(lines, lines), file) :
>   can only push back on text-mode connections
> [...]
> other attached packages:
> [1] ACME_1.8.0

Hi, Joao.

ACME_1.8.0 is meant to be used with R-2.8.0. It looks like ACME_1.4.0 is the appropriate version for R-2.6.0, and it passes on the build report for R-2.6.0 on Windows.

Sean
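A quick way to check for this kind of mismatch is to compare the running R version with the release a package was built for. The sketch below uses only base R and utils; the 2.8.0 threshold comes from the reply above, and installed_version() is a hypothetical helper written for this example, not part of ACME.

```r
# Compare the running R version with the release ACME_1.8.0 is meant for.
r_ok <- getRversion() >= "2.8.0"
message("R ", as.character(getRversion()),
        if (r_ok) " is at least" else " is older than", " R 2.8.0")

# Hypothetical helper: version string of an installed package,
# or NA if the package is not installed in any library.
installed_version <- function(pkg) {
  ip <- installed.packages()
  if (pkg %in% rownames(ip)) unname(ip[pkg, "Version"]) else NA_character_
}

# Returns NA when ACME is not installed on this machine.
acme_version <- installed_version("ACME")
print(acme_version)
```

If the R version is older than the one the installed package targets, installing the package through the Bioconductor installer that matches the running R release should pick a compatible version.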