mouse4302
Nianhua Li ▴ 870
@nianhua-li-1606
Hi,

I received the following email from Lynn Amon and would like to answer it through the mailing list.

mouse4302 was generated with the function ABPkgBuilder in the package AnnBuilder. The strategy is to first map probeset IDs to Entrez Gene IDs and then use the Entrez Gene IDs to retrieve the other annotations (e.g. symbol, RefSeq, pathway, GO). Because 1415822_at, 1415823_at and 1415824_at were all mapped to Entrez Gene ID 20249, which corresponds to Scd1, all of their annotations (e.g. symbol, RefSeq) correspond to Scd1.

So the question comes down to the mapping from probeset ID to Entrez Gene ID. For mouse4302, we obtained the mapping in four ways:

(1) get the probeset to GenBank accession mapping from the Affymetrix annotation, and then use ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz to map GenBank accession to Entrez Gene ID
(2) get the probeset to GenBank accession mapping from the Affymetrix annotation, and then use ftp://ftp.ncbi.nih.gov/repository/UniGene/Mus_musculus/Mm.data.gz to map GenBank accession to Entrez Gene ID
(3) get the probeset to Entrez Gene mapping directly from Affymetrix
(4) get the probeset to UniGene mapping from Affymetrix, and then use ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz to map UniGene cluster to Entrez Gene ID

* Note: the Affymetrix annotation is dated Dec 18, 2005, and the rest are dated March 18, 2006.

We treat the first two as "trusted" sources and the last two as supplementary sources. The supplementary sources are not used unless all the "trusted" sources have missing values for a probeset. Whether we use the "trusted" or the supplementary sources, if they disagree on the mapping of a probeset, we pick the value that most sources agree on. If there is a tie, we pick the first one on the list (i.e. arbitrarily).

In the case of 1415822_at, we got 20249, 20250, 20250, 20250 from the above four methods respectively. (BTW, 1415822_at was mapped to GenBank accession BG060909 in Affymetrix's annotation.)
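To make the selection rule concrete, here is a minimal sketch in Python. It is not the AnnBuilder code (which is written in R); the function name pick_entrez_id and its arguments are made up for illustration, and it assumes a missing mapping is represented as None. It only restates the rule as described: use the trusted sources if any of them has a value, otherwise fall back to the supplementary ones, take the most frequent value, and break ties by list order.

```python
from collections import Counter


def pick_entrez_id(trusted, supplementary):
    """Pick one Entrez Gene ID for a probeset (hypothetical helper).

    trusted / supplementary: lists of candidate IDs in source order,
    with None standing in for a missing mapping (an assumption).
    """
    # Supplementary sources are consulted only when every trusted
    # source is missing a value for this probeset.
    candidates = [v for v in trusted if v is not None]
    if not candidates:
        candidates = [v for v in supplementary if v is not None]
    if not candidates:
        return None

    # Majority vote; on a tie, the value appearing first in the list wins.
    counts = Counter(candidates)
    best = max(counts.values())
    for value in candidates:
        if counts[value] == best:
            return value


# 1415822_at: methods (1) and (2) are trusted, (3) and (4) supplementary.
print(pick_entrez_id([20249, 20250], [20250, 20250]))
```

With the values reported for 1415822_at, the two trusted sources tie one to one, so the sketch returns 20249 purely because it comes first. The two supplementary sources, which both say 20250, never get a vote.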
20250 is the Entrez Gene record for Scd2, and 20249 is for Scd1. The values from the "trusted" sources are 20249 and 20250. Because 20249 happens to be the first one on the list, we picked it. It seems the software picked the wrong value in this particular example, but it might be a reasonable approach in general. I am not an expert, so I would appreciate it if someone could comment on this.

Many thanks,
Nianhua Li
Computational Biology, Public Health, FHCRC

> ---------- Forwarded message ----------
> Date: Thu, 08 Jun 2006 07:13:33 -0700
> From: Lynn Amon <lynnamon at u.washington.edu>
> To: Ting-Yuan Liu <tliu at fhcrc.org>
> Subject: Re: annotation services
>
> Hello Ting,
> I just loaded the newest version of mouse4302 from the Bioconductor 1.8 and it
> is different than the previous version. By chance, I looked at the gene Scd1.
> Previously, 1415965_at and 1415964_at were the only probe ids given for the gene
> Scd1 which agrees with annotation given on the affy website and the chromosome
> view on Ensembl. Now, in addition to those probes, 1415822_at, 1415823_at and
> 1415824_at which were formerly annotated as Scd2 are given the symbol and refseq
> ID for Scd1 which does not agree with affy or Ensembl. Is there a reason for
> these changes? Should I expect to see many changes in this new annotation file?
> Shouldn't this annotation file agree with the annotations given by affy?
> Thanks for your help,
> Lynn Amon