[Fwd: Reporting problem with annotation in biomaRt, Illumina arrays - correction]
Pan Du ★ 1.2k
@pan-du-2010
Hi Nenad and Marc,

Sorry for missing this discussion. What is kept in the lumiHumanIDMapping.db package is basically the Illumina manifest files for the different Illumina chips. If the problem exists, then it is a problem with the Illumina manifest files, not with the package itself.

Illumina has changed its IDs several times, and the versions are not compatible with each other. I am not sure which type of Illumina ID is used in biomaRt. For the early version (version 1) of the Illumina chips, the probe IDs are pure numbers. Later on, Illumina changed the IDs to the form "ILMN_xxx". Illumina also provides Gene IDs (previously called Target IDs). All of this has caused a lot of confusion and difficulty in combining data. That is the reason we invented nuID (which is based on the probe sequence and is globally unique): to avoid all of these problems.

The lumiHumanIDMapping.db package is provided as a convenience for converting between different types of IDs. It is mainly designed for conversion between nuIDs and Illumina IDs. Users can also use it to convert between Illumina IDs by writing a simple script themselves.

Pan

On 2/18/09 10:33 AM, "Marc Carlson" <mcarlson at fhcrc.org> wrote:

> From: Nenad Bartonicek <nbartonicek at gmail.com>
> Date: Wed, 18 Feb 2009 09:33:46 +0000
> To: <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] Reporting problem with annotation in biomaRt, Illumina arrays
> - correction
>
> Dear all,
>
> My apologies, the annotation problem was not with biomaRt, but with
> the prepackaged datasets:
>
> 1. lumiHumanIDMapping.db and
> 2. lumiMouseIDMapping.db.
>
> The description of the problem remains the same, though.
>
> Giulietta and Wolfgang, thank you for the prompt reply.
>
> Regards,
>
> Nenad
>
> p.s. The missing sessionInfo():
>
> R version 2.8.0 (2008-10-20)
> i386-apple-darwin8.11.1
>
> locale:
> en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices datasets tools utils methods
> [8] base
>
> other attached packages:
> [1] lumiMouseIDMapping.db_1.0.0 lumiHumanIDMapping.db_1.0.0
> [3] lumi_1.8.3 RSQLite_0.7-1
> [5] preprocessCore_1.4.0 mgcv_1.4-1.1
> [7] affy_1.20.2 annotate_1.20.1
> [9] xtable_1.5-4 AnnotationDbi_1.4.2
> [11] RMySQL_0.7-2 DBI_0.2-4
> [13] biomaRt_1.16.0 R.utils_1.1.1
> [15] R.oo_1.4.6 R.methodsS3_1.0.3
> [17] Biobase_2.2.1
>
> loaded via a namespace (and not attached):
> [1] RCurl_0.94-0 XML_1.99-0 affyio_1.10.1
>
>
>>> Hi Nenad,
>>>
>>> I had a look at our BioMart interface, the web interface at:
>>>
>>> www.ensembl.org/biomart/martview
>>>
>>> It appears to me that the Illumina V1 probe set attribute for mouse
>>> gives the correct probe names. I believe the Illumina V1 set for
>>> mouse is the same as MouseWG6_V1.
>>>
>>> If you give me more details (i.e. which genes you are looking at, or
>>> which filters you applied) I can give this another try. At first
>>> glance, it doesn't look like an Ensembl data problem.
>>>
>>> Regards,
>>> Giulietta (Ensembl Helpdesk)
>>>
>>>
>>> On Tue Feb 17 13:35:57 2009, huber at ebi.ac.uk wrote:
>>>> Hi Nenad
>>>>
>>>> thank you for reporting this!
>>>>
>>>> Since your question raises a more general operational question with
>>>> biomaRt, I'd like to use the opportunity to explain, to this list,
>>>> what's going on (not quite) behind the scenes. There are three
>>>> separate organisations involved in this information chain:
>>>>
>>>> 1. The Ensembl database team (in Cambridge UK)
>>>>
>>>> 2. The BioMart software developers (in Toronto CA) and Rhoda
>>>> Kinsella (in Cambridge), who imports the Ensembl data into the
>>>> BioMart system
>>>>
>>>> 3. Bioconductor and specifically the biomaRt R package, which is
>>>> simply a thin interface from R to a webservice, with no content or
>>>> logic of its own (maintained by Steffen Durinck in sunny Berkeley.)
>>>>
>>>> Questions at levels 2 and 3 are good to ask on this list and are
>>>> usually efficiently answered e.g. by Steffen or Rhoda.
>>>>
>>>> What you report is, afaIct, an Ensembl data content problem, i.e.
>>>> level 1. Here the advice is to email the Ensembl help desk:
>>>> helpdesk at ensembl.org
>>>>
>>>> I hope this helps, please let us know if you have any more questions
>>>> or observations.
>>>>
>>>> Best wishes
>>>> Wolfgang
>>>>
>>>> ----------------------------------------------------
>>>> Wolfgang Huber, EMBL-EBI, http://www.ebi.ac.uk/huber
>>>>
>>>>
>>>> -------- Original Message --------
>>>> Subject: [BioC] Reporting problem with annotation in biomaRt,
>>>> Illumina arrays
>>>> Date: Tue, 17 Feb 2009 12:02:37 +0000
>>>> From: Nenad Bartonicek <nenad at ebi.ac.uk>
>>>> To: bioconductor at stat.math.ethz.ch
>>>>
>>>> Dear all,
>>>>
>>>> There seems to be a problem with probe annotation of certain
>>>> Illumina arrays in biomaRt.
>>>>
>>>> The following arrays: HumanWG6_V1, HumanRef8_V1, MouseWG6_V1,
>>>> MouseWG6_V1_B do not have valid Illumina probe names under the
>>>> "ProbeId" column.
>>>> They seem to contain values from the column "Array_Address_Id",
>>>> which is the one next to the Probe_id column in the official
>>>> Illumina flat files.
>>>>
>>>> For example, the array "MouseWG6_V1":
>>>>
>>>> library(lumiMouseIDMapping.db)
>>>> dbconn=lumiMouseIDMapping_dbconn()
>>>> tableNames=dbListTables(lumiMouseIDMapping_dbconn())
>>>> tableNames = tableNames[grep("Mouse",tableNames)]
>>>> tableNames
>>>> data = dbReadTable(dbconn,"MouseWG6_V1")
>>>> head(data)
>>>>
>>>> The column ProbeId contains the identifier "105290026", which in the
>>>> flat file at
>>>> http://www.switchtoi.com/pdf/Annotation%20Files/Mouse/MouseWG-6_V1_1_R4_11234304_A.zip
>>>> is under the column Array_Address_Id and has a proper identifier of
>>>> "ILMN_1229450".
>>>>
>>>> Hope this helps and that it might be corrected sometime in the
>>>> future,
>>>>
>>>> Nenad
>>>>
>>>> Nenad Bartonicek
>>>> EMBL-European Bioinformatics Institute
>>>> Wellcome Trust Genome Campus
>>>> Hinxton, Cambridge
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor

------------------------------------------------------
Pan Du, PhD
Research Assistant Professor
Northwestern University Biomedical Informatics Center
750 N. Lake Shore Drive, 11-176
Chicago, IL 60611
Office (312) 503-2360; Fax: (312) 503-5388
dupan (at) northwestern.edu
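
Pan's remark that users can convert between Illumina ID types with a simple script of their own can be sketched roughly as follows. This is only an illustration in base R, not code from the lumi packages: the `manifest` data frame is a toy stand-in for a manifest table, the helper `convert_ids` is a hypothetical name, and the second row of the table is a made-up placeholder. Only the pair "105290026" / "ILMN_1229450" comes from the thread above.

```r
# Toy stand-in for an Illumina manifest table (assumption: in practice this
# would come from the flat file or from dbReadTable() on the mapping package).
manifest <- data.frame(
  Probe_Id         = c("ILMN_1229450", "ILMN_0000001"),  # 2nd value is a made-up placeholder
  Array_Address_Id = c("105290026",    "100000001"),     # 2nd value is a made-up placeholder
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up each ID in column `from` and return the value
# in column `to` of the same row; IDs that are not found come back as NA.
convert_ids <- function(ids, table, from, to) {
  table[[to]][match(ids, table[[from]])]
}

# Convert numeric (version 1 style) IDs to "ILMN_xxx" style IDs.
converted <- convert_ids(c("105290026", "999"), manifest,
                         from = "Array_Address_Id", to = "Probe_Id")
print(converted)
```

Returning NA for unknown IDs, rather than dropping them, keeps the result aligned with the input vector, which makes it easier to spot probes that are missing from a given manifest version.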