Question: Filter out pombe probeset from cerevisiae probesets for yeast2 Affymetrix chip
11.4 years ago by
Guiyuan Lei • 90
Hi Jim,

Thanks for the suggestion. In order to get gene names/symbols for as many cerevisiae probesets as possible, I downloaded the Yeast2 annotation from Affymetrix:
http://www.affymetrix.com/Auth/analysis/downloads/na24/ivt/Yeast_2.na24.annot.csv.zip

Firstly, I found that Bioconductor has names for more cerevisiae probesets than Affymetrix does. In Yeast2GENENAME (from Bioconductor), 4640 of 5900 probesets have gene names (after filtering out the 5028 pombe probesets that are in the mask file s_cerevisiae.msk), while only 4557 of 5900 probesets have gene symbols in Yeast_2.na24.annot.csv (also after filtering out the 5028 pombe probesets, which are labeled with species "pombe" in that file). The Yeast_2.na24.annot.csv I used is the latest file, updated in November 2007. How can Affymetrix have less information than a third party (like Bioconductor)?

Secondly, I found that the s_pombe.zip file from the following Affy web page is NOT consistent with Affymetrix's own annotation file (Yeast_2.na24.annot.csv, mentioned above):
http://www.affymetrix.com/Auth/support/downloads/mask_files/s_pombe.zip

There are 5814 probesets labeled as "cerevisiae" in Yeast_2.na24.annot.csv, so I would expect at least 5814 probesets in s_pombe.msk in order to mask the cerevisiae probesets, but there are only 5749 probesets in s_pombe.msk. In addition, the probeset "177968_at" is not among the 10928 probesets of the Yeast2 chip but is in s_pombe.msk!

Best regards,
Guiyuan

On Nov 29, 2007 4:21 PM, James W. MacDonald <jmacdon at med.umich.edu> wrote:
> Hi Guiyuan,
>
> Guiyuan Lei wrote:
> > Hi Jim,
> >
> > Many thanks. I have checked the s_pombe.msk and s_cerevisiae.msk
> > files; the overlap between pombe and cerevisiae consists of the
> > probesets with the prefixes "AFFX" and "RPTR". One strange thing is
> > that one probeset called "177968_at" is in s_pombe.msk but is NOT
> > among the whole 10928 probesets! So the overlap is 152 probesets.
> >
> > I have one more question: for Yeast2GENENAME, many probesets only
> > have an ID but no gene name (it is "NA"). Is it possible to get a
> > gene name/symbol for all 10928 probesets?
>
> You might check either netaffx or biomaRt, but if there are no gene
> names for certain probesets in the annotation package that usually
> indicates that the probesets in question interrogate things that have
> yet to be named (e.g., ESTs, inferred genes, etc).
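
The consistency checks described in this message (mask file versus annotation labels versus the full chip, and the count of probesets that have a name) come down to set differences. The following is only a minimal Python sketch of that bookkeeping, not the procedure actually used in the thread. The function names and the toy IDs are hypothetical (only "177968_at" comes from the post), and it assumes the probeset IDs have already been read from the .msk and .csv files into plain lists. Treating "---" as the missing-value marker in the Affymetrix CSV, and "NA" as the one exported from the Bioconductor package, is likewise an assumption.

```python
def compare_id_sets(mask_ids, labelled_ids, chip_ids):
    """Cross-check a mask file's probesets against the annotation labels."""
    mask, labelled, chip = set(mask_ids), set(labelled_ids), set(chip_ids)
    return {
        # IDs listed in the mask that do not exist on the chip at all
        "in_mask_not_on_chip": sorted(mask - chip),
        # IDs the annotation assigns to the species but the mask omits
        "labelled_not_in_mask": sorted(labelled - mask),
        # on-chip IDs in the mask that the annotation does not assign
        # to the species (e.g. control probesets)
        "in_mask_not_labelled": sorted((mask & chip) - labelled),
    }


def count_named(symbols, keep_ids, missing=("", "NA", "---")):
    """Count probesets in keep_ids whose symbol is not a missing marker."""
    return sum(1 for pid in keep_ids if symbols.get(pid, "NA") not in missing)


# Toy data only: every ID here is invented except 177968_at.
chip_ids = ["a_at", "b_at", "c_at", "AFFX-x_at"]
labelled_ids = ["a_at", "b_at"]              # labelled "cerevisiae"
mask_ids = ["a_at", "AFFX-x_at", "177968_at"]  # contents of the mask file
symbols = {"a_at": "GENE1", "b_at": "---", "c_at": "NA"}

report = compare_id_sets(mask_ids, labelled_ids, chip_ids)
named = count_named(symbols, ["a_at", "b_at", "c_at"])
print(report)
print(named)
```

Run on the real ID lists, the "labelled_not_in_mask" entry would list the probesets behind the 5814 versus 5749 gap, and "in_mask_not_on_chip" would surface stray IDs such as 177968_at.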