Finding closest gene to chromosomal position (non-human or mouse)
Molly Hanlon ▴ 20
@molly-hanlon-5792
Last seen 9.6 years ago
Hi list,

I've been using R for some time, but I am new to Bioconductor/biomaRt. Please pardon that.

I have a list of a couple thousand SNPs and I'm looking for the closest gene to each of them. Looking only within genes, I get a result for about a third of them, so I obviously need something more robust. Searching the archives, I've found mention of findClosestGene in ACME; however, I work in rice, so this isn't much help. Ideally, I'd like to enter a position, say "9:12344567", and get output saying that it lies 1234 bp upstream of Loc_OsXXXXXX.

I've also seen suggestions that include getting a data frame of all the gene positions and then using findOverlaps with my data frame, though I'm not sure this would work because I have no strand information for my query, and the SNPs are single base pairs and thus have no range.

Feel free to correct my assumptions, but any additional help you can provide would be wonderful.

Many thanks,
Molly
Tim Triche ★ 4.2k
@tim-triche-3561
Last seen 3.6 years ago
United States
If there is a TranscriptDb for rice, the VariantAnnotation package will be super handy. Given a GenomicRanges object with (chromosome, start, end) for each variant (not necessarily a SNP; it could be an indel or a repeat region!), the package will find the nearest gene/transcript, the type of predicted consequence, and the relationship for each variant... if and only if there is a TranscriptDb or GRanges representation of where these genes are.

If there is not a TranscriptDb for rice, you might want to create one... VariantAnnotation is *that useful*.

Just a thought!

Best,
--t
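For readers wondering whether single-base, unstranded positions can be matched against gene ranges at all: they can. What follows is only a minimal sketch of the GRanges idea, not the VariantAnnotation workflow itself. It assumes the GenomicRanges package is installed; the gene IDs, coordinates, and object names (`genes`, `snps`, `result`) are made up for illustration and would normally come from a rice annotation.

```r
library(GenomicRanges)

# Hypothetical gene table (IDs and coordinates are invented for illustration).
genes <- GRanges(
  seqnames = c("9", "9", "9"),
  ranges   = IRanges(start = c(1000, 5000, 9000), end = c(2000, 6000, 9500)),
  strand   = c("+", "-", "+"),
  gene_id  = c("geneA", "geneB", "geneC")
)

# SNPs as width-1 ranges. Strand is left at its default "*",
# which is allowed to match genes on either strand.
snps <- GRanges(seqnames = "9",
                ranges   = IRanges(start = c(1500, 4000, 8000), width = 1))

# Index of the nearest gene for each SNP. nearest() gives NA when a SNP's
# chromosome has no genes; this sketch assumes that does not happen.
idx <- nearest(snps, genes)
g   <- genes[idx]
pos <- start(snps)

# Offset in bp from the SNP to the nearest gene boundary (0 if inside the gene).
bp <- pmax(start(g) - pos, pos - end(g), 0L)

# Upstream/downstream is read relative to the gene's strand:
# a lower coordinate is upstream of a "+" gene but downstream of a "-" gene.
inside   <- pos >= start(g) & pos <= end(g)
before   <- pos < start(g)
minus    <- as.character(strand(g)) == "-"
relation <- ifelse(inside, "inside",
                   ifelse(xor(before, minus), "upstream", "downstream"))

result <- data.frame(
  snp         = paste0(as.character(seqnames(snps)), ":", pos),
  gene        = g$gene_id,
  relation    = relation,
  distance_bp = bp,
  stringsAsFactors = FALSE
)
print(result)
```

Ties between equally distant genes are resolved arbitrarily by `nearest()` by default, so a real analysis may want to inspect those cases separately.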
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 5 months ago
United States
Molly,

You might want to try annotatePeakInBatch in the ChIPpeakAnno package with the parameter output="both".

Best regards,
Julie
I asked a similar question last week and got the solution with the help of Valerie. I was looking for flanking genes and not necessarily the closest. If you are interested, see my blog at http://adairama.wordpress.com/2013/02/15/functionally-annotate-snps-and-indels-in-bioconductor/

Regards,
Adai
Great! I was able to use annotatePeakInBatch as per Julie's recommendation to get what I was looking for. Thanks for the help!