incorrect gene symbols in annotate
Anthony Bosco ▴ 500
@anthony-bosco-517
Last seen 9.7 years ago
Hi,

I have come across some errors when linking probe IDs with gene symbols. In most cases the probe ID retrieves the correct gene symbol; however, the following probe IDs should correspond to CD4 antigen, CD4 antigen, and FCGR3A respectively.

R session code:

    library(Biobase)
    library(annotate)
    library(hgu133a)

    genes <- c("203547_at", "216424_at", "204006_s_at")
    symbol <- multiget(genes, env = hgu133aSYMBOL)
    symbol

Output:

    $"203547_at"
    [1] "C3F"

    $"216424_at"
    [1] NA

    $"204006_s_at"
    [1] "FCGR3B"

Regards,
Anthony

--
Anthony Bosco, Cell Biology Research Assistant, Institute for Child Health Research, Subiaco, Western Australia
John Zhang ★ 2.9k
@john-zhang-6
Last seen 9.7 years ago
You may get somewhat different results depending on the source you are comparing the mappings to, and even on the time when the comparison is made. We try to keep the mappings updated as frequently as we can.

The link "MetaData/Annotation Packages" on the Bioconductor web site contains a brief description of the building process for the annotation data packages, and the vignettes "How to use AnnBuilder" and "Basic Functions of AnnBuilder" contain instructions on how to build an annotation data package. You may try to build your own annotation data package to make sure your annotations are current.

Jianhua Zhang
Department of Biostatistics, Dana-Farber Cancer Institute
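Before deciding whether a mapping is out of date, it can help to tabulate exactly where the package and your external source disagree. The following is only a minimal sketch in base R: `toy_symbol_env` is a hypothetical stand-in for an annotation environment such as `hgu133aSYMBOL`, filled with the values reported in this thread, and `expected` assumes that "CD4" is the symbol the poster had in mind for "CD4 antigen". It uses base `mget()` rather than `multiget()` so that it runs without any Bioconductor package installed.

```r
# Hypothetical stand-in for an annotation environment such as hgu133aSYMBOL.
# The values are the ones reported in this thread, not a current build.
toy_symbol_env <- new.env()
assign("203547_at", "C3F", envir = toy_symbol_env)
assign("216424_at", NA_character_, envir = toy_symbol_env)
assign("204006_s_at", "FCGR3B", envir = toy_symbol_env)

# Symbols expected from the external source (assumption: "CD4" for "CD4 antigen").
expected <- c("203547_at" = "CD4", "216424_at" = "CD4", "204006_s_at" = "FCGR3A")

# mget() looks up several keys in an environment and returns a named list.
found <- unlist(mget(names(expected), envir = toy_symbol_env))

# One row per probe, with a flag that is FALSE for mismatches and for NA.
report <- data.frame(
  probe = names(expected),
  package_symbol = unname(found),
  expected_symbol = unname(expected),
  stringsAsFactors = FALSE
)
report$agrees <- !is.na(report$package_symbol) &
  report$package_symbol == report$expected_symbol

print(report)
```

A table like this, together with the name and date of the external source, is the kind of detail that makes a mapping report easy to follow up.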
rgentleman ★ 5.5k
@rgentleman-7725
Last seen 9.0 years ago
United States
To amplify a bit on Jianhua's explanation: the mapping between any two sets of identifiers can be problematic when both sides are subject to the constant evolution and improvement that presently exists for genomic data. There are many different strategies, and folks need to pick one that satisfies their particular needs.

We have decided on a process that we believe satisfies certain basic requirements (reproducibility being of primary importance). All Bioconductor metadata packages are produced in a well-documented manner. The data sources and their version numbers (or dates of acquisition if the data are not versioned) are provided in the documentation for the package. This allows users to verify our mappings (but only with respect to the data we have selected and the manner in which we have chosen to resolve conflicts that arise).

Differences between our mappings and those available from other sources are not necessarily errors. They may indicate changes in knowledge between when our mapping was done and the current state. They may in fact represent errors, and we take all reports such as this one seriously (but it would be helpful if reports indicated why the person thinks there is an error, what their data source is, and so on). We would especially welcome suggestions for reliable data sources and/or for mappings that are needed but that we do not presently supply. I doubt that it is possible to be concurrent with all data sources (and even if it were, we certainly do not have the resources).

I personally feel that not providing a well-documented set of mappings, and leaving researchers to search through the ever-changing labyrinth that is the reality of the web resources, does them a great disservice. They can spend days trying to decide why the "same analysis" done at different times yielded different sets of genes, only to find out that the web resource had changed between two successive queries. This lack of reproducibility seems very undesirable to me.
We strive for reproducibility of the numerical results; we should do the same for the mappings. We build reasonably often (and can do so on demand), and provide documentation about how we built. We also archive all old versions so that users can assess how changes have impacted their previous mappings if desired.

Robert

On Fri, Nov 21, 2003 at 11:37:51AM -0500, John Zhang wrote:
> You may get somewhat different results depending on the source you are comparing
> the mappings to and even the time when the comparisons are made. We try to keep
> the mappings updated as frequently as we can.

--
Robert Gentleman, Associate Professor, Department of Biostatistics, Harvard School of Public Health
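As a rough illustration of assessing how a rebuild has affected earlier mappings, the sketch below compares two probe-to-symbol vectors in base R. Everything in it is hypothetical: `compare_builds()` is not a Bioconductor function, and the probe IDs and symbols are placeholders rather than real annotation data. In practice the two vectors would be extracted from an archived and a current version of the same annotation package.

```r
# Hypothetical helper: compare probe -> symbol mappings from two builds.
# 'old' and 'new' are named character vectors (names are probe IDs).
compare_builds <- function(old, new) {
  probes <- union(names(old), names(new))
  # Indexing by a name that is absent yields NA, i.e. "not mapped".
  old_sym <- unname(old[probes])
  new_sym <- unname(new[probes])
  # Two entries are the same if both are NA or both are equal symbols,
  # so a change from NA to a symbol is reported as a change.
  same <- (is.na(old_sym) & is.na(new_sym)) |
    (!is.na(old_sym) & !is.na(new_sym) & old_sym == new_sym)
  data.frame(probe = probes, old = old_sym, new = new_sym,
             changed = !same, stringsAsFactors = FALSE)
}

# Placeholder data standing in for two builds of one annotation package.
old_build <- c(probe_A = "GENE1", probe_B = NA, probe_C = "GENE3")
new_build <- c(probe_A = "GENE1", probe_B = "GENE2",
               probe_C = "GENE3B", probe_D = "GENE4")

build_diff <- compare_builds(old_build, new_build)

# Show only the probes whose mapping differs between the two builds.
print(build_diff[build_diff$changed, ])
```

Keeping such a diff alongside an analysis records which gene lists could have been affected by an annotation update.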