Question: BioMaRt query
9.0 years ago, René Dreos wrote:
Dear BioC mailing list,

I am trying to annotate Arabidopsis ATH1 genome array results using biomaRt, but it looks like some of the probesets are not annotated in the biomaRt database. Here is one example:

> library(biomaRt)
> AT.db <- useMart(biomart="plant_mart_6", dataset="athaliana_eg_gene")
> getBM(attributes = c("affy_ath1_121501","ensembl_gene_id","description"),
+       filters = "affy_ath1_121501", values = "254998_at", mart = AT.db)
[1] affy_ath1_121501 ensembl_gene_id  description
<0 rows> (or 0-length row.names)

But if I use the ath1121501.db library to annotate the same probeset:

> library(annotate)
> library(ath1121501.db)
> mget("254998_at", env=ath1121501GENENAME)
$`254998_at`
[1] "encodes a choline synthase whose gene expression is induced by high salt and mannitol."

> mget("254998_at", env=ath1121501ACCNUM)
$`254998_at`
[1] "AT4G09760"

Why is this happening?

Thank you for any advice,
best regards
r

> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] ath1121501.db_2.4.1       org.At.tair.db_2.4.3
 [3] RSQLite_0.9-2             annotate_1.26.1
 [5] ath1121501cdf_2.6.0       biomaRt_2.4.0
 [7] genefilter_1.30.0         marray_1.26.0
 [9] gplots_2.8.0              caTools_1.10
[11] bitops_1.0-4.1            gdata_2.7.2
[13] gtools_2.6.2              bradiar1b520742cdf_1.24.0
[15] arrayQualityMetrics_2.6.0 affyPLM_1.24.1
[17] gcrma_2.20.0              preprocessCore_1.10.0
[19] matchprobes_1.20.0        Biostrings_2.16.9
[21] IRanges_1.6.15            AnnotationDbi_1.10.2
[23] affxparser_1.20.0         makecdfenv_1.26.0
[25] lattice_0.18-8            RMySQL_0.7-5
[27] DBI_0.2-5                 affy_1.26.1
[29] Biobase_2.8.0             limma_3.4.4

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.0-2  RCurl_1.4-2         XML_3.1-1
 [4] affyio_1.16.0       beadarray_1.16.0    hwriter_1.2
 [7] latticeExtra_0.6-14 simpleaffy_2.24.0   splines_2.11.1
[10] stats4_2.11.1       survival_2.35-8     tools_2.11.1
[13] vsn_3.16.0          xtable_1.5-6
Answer: BioMaRt query
9.0 years ago, Kasper Daniel Hansen (United States) wrote:
When you use biomaRt you are querying Ensembl. Ensembl remaps all probesets independently of Affymetrix. The *.db package reflects the annotation available from Affymetrix (current at the time the package was built). So for some reason Ensembl has decided that this particular probeset does not map to a gene. You will need to track down how Ensembl decides on the probeset->gene mapping (which is not trivial) in order to understand why, but my guess is that they are in some sense stricter than Affymetrix.

While this is not related to Ensembl, you might want to read this paper describing some of the problems with probe->probeset->gene mappings:
http://nar.oxfordjournals.org/cgi/content/full/33/20/e175?ijkey=zaJMV7qU1XANIci&keytype=ref

Kasper

On Mon, Oct 4, 2010 at 4:44 AM, René Dreos wrote:
> I am trying to annotate Arabidopsis ATH1 genome array results using biomaRt,
> but it looks like some of the probesets are not annotated in biomaRt
> database.
> [...]
> Why is this happening?
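Since `getBM()` returns no row at all for a probeset that Ensembl has not mapped, it can help to list which of the queried IDs came back empty before comparing against the chip package. A minimal sketch in base R, assuming a hypothetical data frame `bm_result` that stands in for the output of `getBM()`; no live query is made, and the second probeset ID and the gene ID in it are made up for illustration:

```r
# Probesets we asked about; "254998_at" is from the question,
# "000001_at" is a made-up ID for illustration only.
queried <- c("254998_at", "000001_at")

# Hypothetical stand-in for a getBM() result: only mapped probesets get a row.
bm_result <- data.frame(
  affy_ath1_121501 = "000001_at",
  ensembl_gene_id  = "AT0G00000",  # placeholder gene ID, not a real locus
  stringsAsFactors = FALSE
)

# Probesets with no row in the result, i.e. no mapping in that source
unmapped <- setdiff(queried, bm_result$affy_ath1_121501)
print(unmapped)
```

With a real result, `unmapped` would be the set of probesets worth looking up in a second source such as the chip annotation package.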
Answer: BioMaRt query
9.0 years ago, James W. MacDonald (United States) wrote:
Hi Rene,

On 10/4/2010 4:44 AM, René Dreos wrote:
> I am trying to annotate Arabidopsis ATH1 genome array results using biomaRt,
> but it looks like some of the probesets are not annotated in biomaRt
> database.
> [...]
> Why is this happening?

Because you are querying two different data sources and have found an instance in which they are not consistent. This is a pretty common occurrence, given how fluid gene definitions are (and likely will be for some time).

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
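To see how often two sources disagree across a whole array rather than for one probeset, the two mappings can be joined and each probeset classified. This is only a sketch in base R with toy tables: `chip_pkg` and `ensembl` are hypothetical data frames, and apart from the `254998_at` -> `AT4G09760` pair quoted in the question, every ID in them is invented for illustration.

```r
# Toy probeset -> gene tables from two hypothetical sources.
# Only the first row of `chip_pkg` reflects the question; the rest is made up.
chip_pkg <- data.frame(
  probe     = c("254998_at", "000001_at", "000002_at"),
  gene_chip = c("AT4G09760", "AT0G00001", "AT0G00002"),
  stringsAsFactors = FALSE
)
ensembl <- data.frame(
  probe        = c("000001_at", "000002_at"),
  gene_ensembl = c("AT0G00001", "AT0G00009"),
  stringsAsFactors = FALSE
)

# Full outer join keeps probesets that appear in only one source
both <- merge(chip_pkg, ensembl, by = "probe", all = TRUE)

# Classify each probeset: missing in one source, agreeing, or conflicting
both$status <- ifelse(
  is.na(both$gene_chip) | is.na(both$gene_ensembl), "one source only",
  ifelse(both$gene_chip == both$gene_ensembl, "agree", "conflict")
)
print(both)
```

Running `table(both$status)` on the joined table then gives a quick count of how many probesets fall into each class.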