SNP to Gene mapping (GWAS)
1
1
Entering edit mode
Tim Smith ★ 1.1k
@tim-smith-1532
Last seen 10.2 years ago
Hi, I had a list of SNPs that I wanted to map to genes. Is there any package in bioconductor that will let me do this? I looked at GenABEL, but it doesn't seem that I can do it with this package. thanks! [[alternative HTML version deleted]]
• 4.2k views
ADD COMMENT
0
Entering edit mode
@vincent-j-carey-jr-4
Last seen 10 weeks ago
United States
It is not really a well-posed question, but various resources exist. If you have rs numbers for SNPs, you can determine their genomic coordinates using the SNPlocs.Hsapiens.dbSNP.* metadata packages -- the latest uses build 130. You can then use the org.Hs.eg.db package to determine CHR, CHRLOC and CHRLOCEND using the Entrez definitions. A rather limited solution can be had if you put the following in your workspace rs2eg = function (rsid, chr, radius = 1e+05) { require(org.Hs.eg.db) if (!exists("getSNPlocs")) stop("you need a SNPlocs.Hsapiens.dbSNP.* package loaded") if (length(grep("chr", chr)) <= 0) stop("chr must be in form \"chr...\"") numid = as.numeric(gsub("rs", "", rsid)) snlocs = getSNPlocs(chr) loc = snlocs[snlocs[, 1] == numid, "loc"] cstr = gsub("chr", "", chr) if (!(cstr %in% keys(revmap(org.Hs.egCHR)))) stop("your chr is bad; check keys(org.Hs.egCHR)") gonc = get(cstr, revmap(org.Hs.egCHR)) allloc = mget(gonc, org.Hs.egCHRLOC) disamb = sapply(allloc, function(x) abs(as.numeric(x)[1])) allloc = na.omit(disamb) ok = which(abs(loc - allloc) <= radius) if (length(ok) > 0) return(names(allloc)[ok]) return(NULL) } > rs2eg("rs6060535", "chr20") [1] "6676" "8904" "9054" "9584" "10137" "80307" "140823" These are genes that have CHRLOC address within 100kb of the snp rs6060535. Other resources for genomic annotation are emerging in the IRanges/rtracklayer packages and these could surely be relevant for large-scale solutions. Note that the program given here is not vectorized but it would not be hard to extend it to something that is. Of course if you find attributes that are more pertinent to your question in the biomaRt facility, that might be better. Let's check: > mart = useMart("snp", dataset="hsapiens_snp") Checking attributes ... ok Checking filters ... ok > getBM(mart=mart, filters="refsnp", value="rs6060535", + attr=c("chr_name", "chrom_start", "associated_gene", "ensembl_gene_stable_id")) chr_name chrom_start associated_gene ensembl_gene_stable_id 1 20 34235522 NA ENSG00000214078 2 20 34235522 NA ENSG00000222460 3 20 34235522 NA ENSG00000244462 > mget(.Last.value[,4], org.Hs.egENSEMBL2EG, ifnotfound=NA) $ENSG00000214078 [1] "8904" "9054" "10137" $ENSG00000222460 [1] NA $ENSG00000244462 [1] NA This result is compatible with the rs2eg function in that it identifies stably named ensembl genes (but no 'associated' genes, sic) that coincide with the entrez genes found by my distance-parameterized search. > sessionInfo() R version 2.10.0 beta (2009-10-14 r50081) i386-apple-darwin9.8.0 locale: [1] C attached base packages: [1] stats graphics grDevices datasets tools utils methods [8] base other attached packages: [1] org.Hs.eg.db_2.3.6 RSQLite_0.7-3 DBI_0.2-4 [4] AnnotationDbi_1.7.20 Biobase_2.5.8 biomaRt_2.1.0 [7] weaver_1.11.1 codetools_0.2-2 digest_0.4.1 loaded via a namespace (and not attached): [1] RCurl_1.2-1 XML_2.6-0 On Wed, Oct 21, 2009 at 2:54 PM, Tim Smith <tim_smith_666 at="" yahoo.com=""> wrote: > Hi, > > I had a list of SNPs that I wanted to map to genes. Is there any package in bioconductor that will let me do this? I looked at GenABEL, but it doesn't seem that I can do it with this package. > thanks! > > > > ? ? ? ?[[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD COMMENT
0
Entering edit mode
Thanks for that Vincent. The SNPs that I am using are associated with the NCBI build 36 (UCSC - hg18). However, if I use the 'org.Hs.eg.db' package, I might not get the right NCBI build. I had used the biomaRt package to get an archived build before, but I am not sure how to go about it for the code you suggested. thanks again! ________________________________ From: Vincent Carey <stvjc@channing.harvard.edu> Cc: bioc <bioconductor@stat.math.ethz.ch> Sent: Wed, October 21, 2009 3:33:53 PM Subject: Re: [BioC] SNP to Gene mapping (GWAS) It is not really a well-posed question, but various resources exist. If you have rs numbers for SNPs, you can determine their genomic coordinates using the SNPlocs.Hsapiens.dbSNP.* metadata packages -- the latest uses build 130. You can then use the org.Hs.eg.db package to determine CHR, CHRLOC and CHRLOCEND using the Entrez definitions. A rather limited solution can be had if you put the following in your workspace rs2eg = function (rsid, chr, radius = 1e+05) { require(org.Hs.eg.db) if (!exists("getSNPlocs")) stop("you need a SNPlocs.Hsapiens.dbSNP.* package loaded") if (length(grep("chr", chr)) <= 0) stop("chr must be in form \"chr...\"") numid = as.numeric(gsub("rs", "", rsid)) snlocs = getSNPlocs(chr) loc = snlocs[snlocs[, 1] == numid, "loc"] cstr = gsub("chr", "", chr) if (!(cstr %in% keys(revmap(org.Hs.egCHR)))) stop("your chr is bad; check keys(org.Hs.egCHR)") gonc = get(cstr, revmap(org.Hs.egCHR)) allloc = mget(gonc, org.Hs.egCHRLOC) disamb = sapply(allloc, function(x) abs(as.numeric(x)[1])) allloc = na.omit(disamb) ok = which(abs(loc - allloc) <= radius) if (length(ok) > 0) return(names(allloc)[ok]) return(NULL) } > rs2eg("rs6060535", "chr20") [1] "6676" "8904" "9054" "9584" "10137" "80307" "140823" These are genes that have CHRLOC address within 100kb of the snp rs6060535. Other resources for genomic annotation are emerging in the IRanges/rtracklayer packages and these could surely be relevant for large-scale solutions. Note that the program given here is not vectorized but it would not be hard to extend it to something that is. Of course if you find attributes that are more pertinent to your question in the biomaRt facility, that might be better. Let's check: > mart = useMart("snp", dataset="hsapiens_snp") Checking attributes ... ok Checking filters ... ok > getBM(mart=mart, filters="refsnp", value="rs6060535", + attr=c("chr_name", "chrom_start", "associated_gene", "ensembl_gene_stable_id")) chr_name chrom_start associated_gene ensembl_gene_stable_id 1 20 34235522 NA ENSG00000214078 2 20 34235522 NA ENSG00000222460 3 20 34235522 NA ENSG00000244462 > mget(.Last.value[,4], org.Hs.egENSEMBL2EG, ifnotfound=NA) $ENSG00000214078 [1] "8904" "9054" "10137" $ENSG00000222460 [1] NA $ENSG00000244462 [1] NA This result is compatible with the rs2eg function in that it identifies stably named ensembl genes (but no 'associated' genes, sic) that coincide with the entrez genes found by my distance-parameterized search. > sessionInfo() R version 2.10.0 beta (2009-10-14 r50081) i386-apple-darwin9.8.0 locale: [1] C attached base packages: [1] stats graphics grDevices datasets tools utils methods [8] base other attached packages: [1] org.Hs.eg.db_2.3.6 RSQLite_0.7-3 DBI_0.2-4 [4] AnnotationDbi_1.7.20 Biobase_2.5.8 biomaRt_2.1.0 [7] weaver_1.11.1 codetools_0.2-2 digest_0.4.1 loaded via a namespace (and not attached): [1] RCurl_1.2-1 XML_2.6-0 > Hi, > > I had a list of SNPs that I wanted to map to genes. Is there any package in bioconductor that will let me do this? I looked at GenABEL, but it doesn't seem that I can do it with this package. > thanks! > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
I don't quite follow this question. Perhaps folks more expert in annotation and versioning can comment. It seems you would want to swap out the CHRLOC information in my function for something else, and various things that are quite new in the Biostrings/GenomicFeatures packages should help with increasing flexibility for this part of the task. Some sort of very high level computation like an intersection of a snp set and a gene set should be possible to meet your objective, but I don't have the constructs under my belt yet. On Sun, Oct 25, 2009 at 9:19 AM, Tim Smith <tim_smith_666 at="" yahoo.com=""> wrote: > Thanks for that Vincent. > The SNPs that I am using are associated with the NCBI build 36 (UCSC - > hg18). However, if I use the 'org.Hs.eg.db' package, I might not get the > right NCBI build. > I had used the biomaRt package to get an archived build before, but I am not > sure how to go about it for the code you suggested. > thanks again! > ________________________________ > From: Vincent Carey <stvjc at="" channing.harvard.edu=""> > To: Tim Smith <tim_smith_666 at="" yahoo.com=""> > Cc: bioc <bioconductor at="" stat.math.ethz.ch=""> > Sent: Wed, October 21, 2009 3:33:53 PM > Subject: Re: [BioC] SNP to Gene mapping (GWAS) > > It is not really a well-posed question, but various resources exist. > > If you have rs numbers for SNPs, you can determine their genomic > coordinates using the SNPlocs.Hsapiens.dbSNP.* metadata packages -- > the latest uses build 130.? You can then use the org.Hs.eg.db package > to determine CHR, CHRLOC and CHRLOCEND using the Entrez definitions. > > A rather limited solution can be had if you put the following in your > workspace > > rs2eg = function (rsid, chr, radius = 1e+05) > { > ? ? require(org.Hs.eg.db) > ? ? if (!exists("getSNPlocs")) > ? ? ? ? stop("you need a SNPlocs.Hsapiens.dbSNP.* package loaded") > ? ? if (length(grep("chr", chr)) <= 0) > ? ? ? ? stop("chr must be in form \"chr...\"") > ? ? numid = as.numeric(gsub("rs", "", rsid)) > ? ? snlocs = getSNPlocs(chr) > ? ? loc = snlocs[snlocs[, 1] == numid, "loc"] > ? ? cstr = gsub("chr", "", chr) > ? ? if (!(cstr %in% keys(revmap(org.Hs.egCHR)))) > ? ? ? ? stop("your chr is bad; check keys(org.Hs.egCHR)") > ? ? gonc = get(cstr, revmap(org.Hs.egCHR)) > ? ? allloc = mget(gonc, org.Hs.egCHRLOC) > ? ? disamb = sapply(allloc, function(x) abs(as.numeric(x)[1])) > ? ? allloc = na.omit(disamb) > ? ? ok = which(abs(loc - allloc) <= radius) > ? ? if (length(ok) > 0) > ? ? ? ? return(names(allloc)[ok]) > ? ? return(NULL) > } > >> rs2eg("rs6060535", "chr20") > [1] "6676"? "8904"? "9054"? "9584"? "10137"? "80307"? "140823" > > These are genes that have CHRLOC address within 100kb of the snp rs6060535. > Other resources for genomic annotation are emerging in the > IRanges/rtracklayer packages and these could surely be relevant for > large-scale solutions.? Note that the program given here is not > vectorized but it would not be hard to extend it to something that is. > > Of course if you find attributes that are more pertinent to your > question in the biomaRt facility, that might be better.? Let's check: > >> mart = useMart("snp", dataset="hsapiens_snp") > Checking attributes ... ok > Checking filters ... ok >> getBM(mart=mart, filters="refsnp", value="rs6060535", > +? attr=c("chr_name", "chrom_start", "associated_gene", > "ensembl_gene_stable_id")) > ? chr_name chrom_start associated_gene ensembl_gene_stable_id > 1? ? ? 20? ? 34235522? ? ? ? ? ? ? NA? ? ? ? ENSG00000214078 > 2? ? ? 20? ? 34235522? ? ? ? ? ? ? NA? ? ? ? ENSG00000222460 > 3? ? ? 20? ? 34235522? ? ? ? ? ? ? NA? ? ? ? ENSG00000244462 >> mget(.Last.value[,4], org.Hs.egENSEMBL2EG, ifnotfound=NA) > $ENSG00000214078 > [1] "8904"? "9054"? "10137" > > $ENSG00000222460 > [1] NA > > $ENSG00000244462 > [1] NA > > This result is compatible with the rs2eg function in that it > identifies stably named ensembl genes (but no 'associated' genes, sic) > that coincide with the entrez genes found by my distance- parameterized > search. > >> sessionInfo() > R version 2.10.0 beta (2009-10-14 r50081) > i386-apple-darwin9.8.0 > > locale: > [1] C > > attached base packages: > [1] stats? ? graphics? grDevices datasets? tools? ? utils? ? methods > [8] base > > other attached packages: > [1] org.Hs.eg.db_2.3.6? RSQLite_0.7-3? ? ? ? DBI_0.2-4 > [4] AnnotationDbi_1.7.20 Biobase_2.5.8? ? ? ? biomaRt_2.1.0 > [7] weaver_1.11.1? ? ? ? codetools_0.2-2? ? ? digest_0.4.1 > > loaded via a namespace (and not attached): > [1] RCurl_1.2-1 XML_2.6-0 > > > > On Wed, Oct 21, 2009 at 2:54 PM, Tim Smith <tim_smith_666 at="" yahoo.com=""> wrote: >> Hi, >> >> I had a list of SNPs that I wanted to map to genes. Is there any package >> in bioconductor that will let me do this? I looked at GenABEL, but it >> doesn't seem that I can do it with this package. >> thanks! >> >> >> >> ? ? ? ?[[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > >
ADD REPLY
0
Entering edit mode
On Sun, Oct 25, 2009 at 9:19 AM, Tim Smith <tim_smith_666@yahoo.com> wrote: > Thanks for that Vincent. > > The SNPs that I am using are associated with the NCBI build 36 (UCSC - > hg18). However, if I use the 'org.Hs.eg.db' package, I might not get the > right NCBI build. > I had used the biomaRt package to get an archived build before, but I am > not sure how to go about it for the code you suggested. > thanks again! > > The org.Hs.eg.db is based on hg18, currently. You can certainly get the gene starts and ends from other sources, though, if you like. Sean > > > ________________________________ > From: Vincent Carey <stvjc@channing.harvard.edu> > > Cc: bioc <bioconductor@stat.math.ethz.ch> > Sent: Wed, October 21, 2009 3:33:53 PM > Subject: Re: [BioC] SNP to Gene mapping (GWAS) > > It is not really a well-posed question, but various resources exist. > > If you have rs numbers for SNPs, you can determine their genomic > coordinates using the SNPlocs.Hsapiens.dbSNP.* metadata packages -- > the latest uses build 130. You can then use the org.Hs.eg.db package > to determine CHR, CHRLOC and CHRLOCEND using the Entrez definitions. > > A rather limited solution can be had if you put the following in your > workspace > > rs2eg = function (rsid, chr, radius = 1e+05) > { > require(org.Hs.eg.db) > if (!exists("getSNPlocs")) > stop("you need a SNPlocs.Hsapiens.dbSNP.* package loaded") > if (length(grep("chr", chr)) <= 0) > stop("chr must be in form \"chr...\"") > numid = as.numeric(gsub("rs", "", rsid)) > snlocs = getSNPlocs(chr) > loc = snlocs[snlocs[, 1] == numid, "loc"] > cstr = gsub("chr", "", chr) > if (!(cstr %in% keys(revmap(org.Hs.egCHR)))) > stop("your chr is bad; check keys(org.Hs.egCHR)") > gonc = get(cstr, revmap(org.Hs.egCHR)) > allloc = mget(gonc, org.Hs.egCHRLOC) > disamb = sapply(allloc, function(x) abs(as.numeric(x)[1])) > allloc = na.omit(disamb) > ok = which(abs(loc - allloc) <= radius) > if (length(ok) > 0) > return(names(allloc)[ok]) > return(NULL) > } > > > rs2eg("rs6060535", "chr20") > [1] "6676" "8904" "9054" "9584" "10137" "80307" "140823" > > These are genes that have CHRLOC address within 100kb of the snp rs6060535. > Other resources for genomic annotation are emerging in the > IRanges/rtracklayer packages and these could surely be relevant for > large-scale solutions. Note that the program given here is not > vectorized but it would not be hard to extend it to something that is. > > Of course if you find attributes that are more pertinent to your > question in the biomaRt facility, that might be better. Let's check: > > > mart = useMart("snp", dataset="hsapiens_snp") > Checking attributes ... ok > Checking filters ... ok > > getBM(mart=mart, filters="refsnp", value="rs6060535", > + attr=c("chr_name", "chrom_start", "associated_gene", > "ensembl_gene_stable_id")) > chr_name chrom_start associated_gene ensembl_gene_stable_id > 1 20 34235522 NA ENSG00000214078 > 2 20 34235522 NA ENSG00000222460 > 3 20 34235522 NA ENSG00000244462 > > mget(.Last.value[,4], org.Hs.egENSEMBL2EG, ifnotfound=NA) > $ENSG00000214078 > [1] "8904" "9054" "10137" > > $ENSG00000222460 > [1] NA > > $ENSG00000244462 > [1] NA > > This result is compatible with the rs2eg function in that it > identifies stably named ensembl genes (but no 'associated' genes, sic) > that coincide with the entrez genes found by my distance- parameterized > search. > > > sessionInfo() > R version 2.10.0 beta (2009-10-14 r50081) > i386-apple-darwin9.8.0 > > locale: > [1] C > > attached base packages: > [1] stats graphics grDevices datasets tools utils methods > [8] base > > other attached packages: > [1] org.Hs.eg.db_2.3.6 RSQLite_0.7-3 DBI_0.2-4 > [4] AnnotationDbi_1.7.20 Biobase_2.5.8 biomaRt_2.1.0 > [7] weaver_1.11.1 codetools_0.2-2 digest_0.4.1 > > loaded via a namespace (and not attached): > [1] RCurl_1.2-1 XML_2.6-0 > > > > > > Hi, > > > > I had a list of SNPs that I wanted to map to genes. Is there any package > in bioconductor that will let me do this? I looked at GenABEL, but it > doesn't seem that I can do it with this package. > > thanks! > > > > > > > > [[alternative HTML version deleted]] > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor@stat.math.ethz.ch > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hi Vincent, Sean, I have not stated the problem clearly. My fault! I have some SNPs from the WTCCC dataset. As I understand, the annotations for this dataset were made with NCBI build 36, which corresponds to hg18 from UCSC. What I want to do is for each of these SNPs find out the genes associated with it. Vincent's method will get these for me, but whenever org.Hs.eg.db is updated to hg19, I'll get incorrect results. Most likely, I'd need to check if the package has been updated every time I run the code. Not impossible, but I'd like to avoid it if I can. The other alternative is to use some of the archived marts - this might ensure that the same version of the database is accessed every time I run the code. For example, to get the values for 'rs3094315', something like this might work (though it will only give the stable ids): -------------------------------------------------------- mart49 = useMart("snp_mart_49", dataset = "hsapiens_snp", archive = TRUE) getBM(mart=mart49, filters="refsnp", value="rs3094315", attr=c("chr_name", "chrom_start", "ensembl_gene_stable_id")) ------------------------------------------------------- However, this seems to be giving an error: Error in listFilters(mart, what = "type") : The function argument 'what' contains an invalid value: type Valid are: name, description, options, fullDescription, filters5, filters6 I checked the filters, and 'refsnp' should work for this. Right? Or am I going about this in a totally wrong manner? [[elided Yahoo spam]] Tim > sessionInfo() R version 2.9.1 (2009-06-26) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] splines grid stats graphics grDevices utils datasets methods [9] base other attached packages: [1] SNPlocs.Hsapiens.dbSNP.20071016_0.99.1 SNPlocs.Hsapiens.dbSNP.20080617_0.99.1 [3] SNPMaP_1.0.2 R.huge_0.2.0 [5] R.utils_1.2.2 R.oo_1.6.2 [7] R.methodsS3_1.0.3 affxparser_1.16.0 [9] SNPMaP.cdm_1.0.0 SNPassoc_1.6-0 [11] mvtnorm_0.9-7 survival_2.35-4 [13] haplo.stats_1.4.4 gplots_2.7.1 [15] caTools_1.9 gdata_2.6.1 [17] gtools_2.6.1 lattice_0.17-25 [19] org.Hs.eg.db_2.2.11 GOstats_2.10.0 [21] graph_1.22.2 Category_2.10.1 [23] GO.db_2.2.11 biomaRt_2.0.0 [25] multtest_2.1.1 XML_2.5-3 [27] genefilter_1.24.2 hgu133a.db_2.2.12 [29] RSQLite_0.7-1 DBI_0.2-4 [31] AnnotationDbi_1.6.1 GEOquery_2.8.0 [33] RCurl_0.98-1 bitops_1.0-4.1 [35] affy_1.22.0 Biobase_2.4.1 loaded via a namespace (and not attached): [1] affyio_1.12.0 annotate_1.22.0 GSEABase_1.6.1 MASS_7.2-47 [5] preprocessCore_1.6.0 RBGL_1.20.0 tools_2.9.1 xtable_1.5-5 > ________________________________ From: Sean Davis <seandavi@gmail.com> Cc: Vincent Carey <stvjc@channing.harvard.edu>; bioc <bioconductor@stat.math.ethz.ch> Sent: Sun, October 25, 2009 1:13:00 PM Subject: Re: [BioC] SNP to Gene mapping (GWAS) >Thanks for that Vincent. > >>The SNPs that I am using are associated with the NCBI build 36 (UCSC - hg18). However, if I use the 'org.Hs.eg.db' package, I might not get the right NCBI build. >>I had used the biomaRt package to get an archived build before, but I am not sure how to go about it for the code you suggested. >>thanks again! > > The org.Hs.eg.db is based on hg18, currently. You can certainly get the gene starts and ends from other sources, though, if you like. Sean > > >>________________________________ >>From: Vincent Carey <stvjc@channing.harvard.edu> > >>Cc: bioc <bioconductor@stat.math.ethz.ch> >>Sent: Wed, October 21, 2009 3:33:53 PM >>Subject: Re: [BioC] SNP to Gene mapping (GWAS) > > >>It is not really a well-posed question, but various resources exist. > >>If you have rs numbers for SNPs, you can determine their genomic >>coordinates using the SNPlocs.Hsapiens.dbSNP.* metadata packages -- >>the latest uses build 130. You can then use the org.Hs.eg.db package >>to determine CHR, CHRLOC and CHRLOCEND using the Entrez definitions. > >>A rather limited solution can be had if you put the following in your workspace > >>rs2eg = function (rsid, chr, radius = 1e+05) >>{ >> require(org.Hs.eg.db) >> if (!exists("getSNPlocs")) >> stop("you need a SNPlocs.Hsapiens.dbSNP.* package loaded") >> if (length(grep("chr", chr)) <= 0) >> stop("chr must be in form \"chr...\"") >> numid = as.numeric(gsub("rs", "", rsid)) >> snlocs = getSNPlocs(chr) >> loc = snlocs[snlocs[, 1] == numid, "loc"] >> cstr = gsub("chr", "", chr) >> if (!(cstr %in% keys(revmap(org.Hs.egCHR)))) >> stop("your chr is bad; check keys(org.Hs.egCHR)") >> gonc = get(cstr, revmap(org.Hs.egCHR)) >> allloc = mget(gonc, org.Hs.egCHRLOC) >> disamb = sapply(allloc, function(x) abs(as.numeric(x)[1])) >> allloc = na.omit(disamb) >> ok = which(abs(loc - allloc) <= radius) >> if (length(ok) > 0) >> return(names(allloc)[ok]) >> return(NULL) >>} > >>> rs2eg("rs6060535", "chr20") >>[1] "6676" "8904" "9054" "9584" "10137" "80307" "140823" > >>These are genes that have CHRLOC address within 100kb of the snp rs6060535. >>Other resources for genomic annotation are emerging in the >>IRanges/rtracklayer packages and these could surely be relevant for >>large-scale solutions. Note that the program given here is not >>vectorized but it would not be hard to extend it to something that is. > >>Of course if you find attributes that are more pertinent to your >>question in the biomaRt facility, that might be better. Let's check: > >>> mart = useMart("snp", dataset="hsapiens_snp") >>Checking attributes ... ok >>Checking filters ... ok >>> getBM(mart=mart, filters="refsnp", value="rs6060535", >>+ attr=c("chr_name", "chrom_start", "associated_gene", >>"ensembl_gene_stable_id")) >> chr_name chrom_start associated_gene ensembl_gene_stable_id >>1 20 34235522 NA ENSG00000214078 >>2 20 34235522 NA ENSG00000222460 >>3 20 34235522 NA ENSG00000244462 >>> mget(.Last.value[,4], org.Hs.egENSEMBL2EG, ifnotfound=NA) >>$ENSG00000214078 >>[1] "8904" "9054" "10137" > >>$ENSG00000222460 >>[1] NA > >>$ENSG00000244462 >>[1] NA > >>This result is compatible with the rs2eg function in that it >>identifies stably named ensembl genes (but no 'associated' genes, sic) >>that coincide with the entrez genes found by my distance- parameterized >>search. > >>> sessionInfo() >>R version 2.10.0 beta (2009-10-14 r50081) >>i386-apple-darwin9.8.0 > >>locale: >>[1] C > >>attached base packages: >>[1] stats graphics grDevices datasets tools utils methods >>[8] base > >>other attached packages: >>[1] org.Hs.eg.db_2.3.6 RSQLite_0.7-3 DBI_0.2-4 >>[4] AnnotationDbi_1.7.20 Biobase_2.5.8 biomaRt_2.1.0 >>[7] weaver_1.11.1 codetools_0.2-2 digest_0.4.1 > >>loaded via a namespace (and not attached): >>[1] RCurl_1.2-1 XML_2.6-0 > > > > > >> Hi, >>> >>> I had a list of SNPs that I wanted to map to genes. Is there any package in bioconductor that will let me do this? I looked at GenABEL, but it doesn't seem that I can do it with this package. >>> thanks! >>> >>> >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor@stat.math.ethz.ch >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>> > > > > >> [[alternative HTML version deleted]] > >>_______________________________________________ >>Bioconductor mailing list >Bioconductor@stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >>Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
This works for me, will give you Vega 35 library(biomaRt) listMarts(host="may2009.archive.ensembl.org",path="/biomart/martservic e",archive=FALSE) mart=useMart("ENSEMBL_MART_SNP",dataset="hsapiens_snp", host="may2009.archive.ensembl.org",path="/biomart/martservice",archive =FALSE) getBM(mart=mart, filters="refsnp", value="rs3094315", attr=c("chr_name", "chrom_start", "ensembl_gene_stable_id")) your could could dump the parts of org.Hs that you want to a table and keep that... -----Original Message----- From: Tim Smith <tim_smith_666@yahoo.com> To: Sean Davis <seandavi at="" gmail.com=""> Cc: Vincent Carey <stvjc at="" channing.harvard.edu="">, bioc <bioconductor at="" stat.math.ethz.ch=""> Subject: Re: [BioC] SNP to Gene mapping (GWAS) Date: Sun, 25 Oct 2009 17:20:44 -0700 (PDT) getBM(mart=mart49, filters="refsnp", value="rs3094315", attr=c("chr_name", "chrom_start", "ensembl_gene_stable_id"))
ADD REPLY

Login before adding your answer.

Traffic: 622 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6