Hello.
I have been trying for three days to find an automated way to fix this issue, but so far without success.
Here is the story.
I'm trying to analyse a dataset that was produced on the Affymetrix GeneChip Human Genome U133 Plus 2.0 array. After applying the RMA algorithm, I used the following code to add annotation columns with the gene symbol and Entrez ID to my expression data.frame:
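library(hgu133plus2.db)  # annotation package for this array
probes <- row.names(expressions)
# Look up the gene symbol and Entrez ID for every probe (NA if not found)
Symbols <- unlist(mget(probes, hgu133plus2SYMBOL, ifnotfound = NA))
Entrez_IDs <- unlist(mget(probes, hgu133plus2ENTREZID, ifnotfound = NA))
# Put the annotation columns in front of the expression values
expressions <- cbind(probes, Symbols, Entrez_IDs, expressions)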
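The probes left without annotation can be counted and listed with a check along these lines. This is only a minimal sketch: it assumes the Symbols vector keeps the probe IDs as names, and the toy vector below (with made-up probe IDs and symbols) stands in for the real one so that the snippet runs on its own.

```r
# Toy stand-in for the Symbols vector built above; names are probe IDs and
# every value here is made up for illustration.
Symbols <- c(probe_1 = "GENE_A", probe_2 = NA, probe_3 = "GENE_B", probe_4 = NA)

# Probes for which no gene symbol was found
missing_probes <- names(Symbols)[is.na(Symbols)]

length(missing_probes)  # number of unannotated probes
missing_probes          # their IDs, e.g. to paste into a conversion tool
```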
After that I found that some probe IDs (11734 in total) have no gene symbol / Entrez ID, possibly because of an update to the annotation database. I searched the net for an automated solution, and so far I have found this site, where I chose to convert Affy IDs --> Gene symbol. I then supplied a list of some of the unidentified probe IDs and hit submit:
1007_s_at
1294_at
1552283_s_at
1552388_at
1552401_a_at
1552411_at
1552412_a_at
1552449_a_at
1552563_a_at
1552607_at
The problem now is that for some of them this tool returns two gene symbols, and I don't know which one to choose.
What do you guys do in such cases?
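To make the ambiguity concrete, here is a minimal sketch of how probes with more than one symbol could be flagged and their symbols kept together in one string. The table is a toy example with placeholder probe IDs and symbols, and `collapse_symbols` is a hypothetical helper. A real table of the same shape could presumably be obtained with `AnnotationDbi::select(hgu133plus2.db, keys = probes, columns = "SYMBOL", keytype = "PROBEID")`, assuming that package is installed. Collapsing is just one possible convention, not a recommendation, and the separator is an arbitrary choice.

```r
# Toy long-format annotation table: one row per probe/symbol pair.
# Probe IDs and symbols are placeholders, not real annotations.
ann <- data.frame(
  PROBEID = c("probe_1", "probe_2", "probe_2", "probe_3"),
  SYMBOL  = c("GENE_A",  "GENE_B",  "GENE_C",  NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join all distinct symbols of one probe into a single
# string, returning NA when the probe has no symbol at all.
collapse_symbols <- function(x) {
  x <- unique(x[!is.na(x)])
  if (length(x) == 0) NA_character_ else paste(x, collapse = " /// ")
}

# Group the symbols by probe
by_probe <- split(ann$SYMBOL, ann$PROBEID)

# Number of distinct non-missing symbols per probe
n_symbols <- vapply(by_probe, function(x) length(unique(x[!is.na(x)])), integer(1))

# Probes that map to more than one symbol
ambiguous <- names(n_symbols)[n_symbols > 1]

# One (possibly multi-symbol) string per probe
collapsed <- vapply(by_probe, collapse_symbols, character(1))
```

Here `ambiguous` holds the probes that would need a decision, and `collapsed` keeps every symbol so that nothing is discarded before that decision is made.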
Thank you.