Annotation is not an equivalence (question regarding mouse4302.db and similar packages) - correction
@vojtech-kulvait-4537
Last seen 9.8 years ago
Hello,

first of all, thank you for the great effort you have put into developing the Bioconductor packages.

I have a question regarding annotations. I am currently working with the mouse4302.db database (the Mouse Genome 430 2.0 Array, I guess). First, I wanted to find all the probes matching a single gene. I thought that if I found all aliases of the gene name and then the probes matching them, that would be enough, and I could then somehow sum expression values by gene (if there is a function or package to do so, please let me know).

What I found is that there is not only one official symbol for each gene. There seem to be two different official symbols for a single gene. On top of that, the relation that groups probes by aliases or gene names is not an equivalence. I mean, if the relation between two probes is "to be marked with the same alias" or "to be marked with the same gene", then given, for example:

1439356_at and 1433512_at are marked with the "Fli1" alias

1439356_at and 1448189_a_at are marked with the "3632430F08Rik" alias

I would expect 1433512_at to be marked with the "3632430F08Rik" alias as well, but it is not.

More precisely, I would expect the set of probes "marked with a symbol or alias" to be the same for all aliases, but it is not, and I do not know whether this is a mistake in the annotation process or whether there is some deeper biological explanation for it.

I am attaching a code sample for testing this. Please run geneInfo("Fli1") to test.

With kind regards,
Vojtech Kulvait

geneInfo
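The non-equivalence described above can be reproduced without the annotation package. The following is a minimal sketch in base R; the `alias2probe` list is a hand-written toy mapping that copies the two alias groups from the post (it is not real output from mouse4302.db), and `shares_alias` is a hypothetical helper introduced only for illustration.

```r
# Toy alias -> probe mapping (hypothetical; mirrors the example in the post,
# not real output from mouse4302.db).
alias2probe <- list(
  "Fli1"          = c("1439356_at", "1433512_at"),
  "3632430F08Rik" = c("1439356_at", "1448189_a_at")
)

# TRUE if at least one alias is attached to both probes.
shares_alias <- function(p, q) {
  any(vapply(alias2probe,
             function(probes) all(c(p, q) %in% probes),
             logical(1)))
}

ab <- shares_alias("1433512_at", "1439356_at")    # both carry "Fli1"
bc <- shares_alias("1439356_at", "1448189_a_at")  # both carry "3632430F08Rik"
ac <- shares_alias("1433512_at", "1448189_a_at")  # no alias in common

# An equivalence relation must be transitive: ab and bc would imply ac.
print(c(ab = ab, bc = bc, ac = ac))
```

Because `ab` and `bc` hold while `ac` does not, "shares an alias" is not transitive on this toy data, which is the behaviour the question describes.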
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.9 years ago
United States
Hi Vojtech,

I am having a little trouble understanding your post, so here is my 1st attempt:

The 1st thing to notice is that these 4 probes don't map to the same gene.

    mget(c("1439356_at","1433512_at","1439356_at","1448189_a_at"), mouse4302ENTREZID)

Notice that the 2nd probeset maps to a different gene (one that is on chromosome 9 instead of 11). We should therefore expect different gene symbols for this as we explore further.

Now I can see what "official" gene symbol they each map to by doing this:

    mget(c("1439356_at","1433512_at","1439356_at","1448189_a_at"), mouse4302SYMBOL)

And I can see what alternate gene symbol "aliases" they might each map to by doing this:

    mget(c("1439356_at","1433512_at","1439356_at","1448189_a_at"), revmap(mouse4302ALIAS2PROBE))

And now we can see why gene symbols are such a very poor way to identify genes. You can see that even though there are two different Entrez Gene IDs represented here, they still share (as an alias) the gene symbol Fli1! That is the kind of lousy thing that happens with gene symbols, which is why I always encourage people to use a "real" identifier (meaning it will actually be unique!) such as an Entrez Gene ID instead.

I hope this helps,

Marc

On 03/08/2011 01:50 AM, Vojtech Kulvait wrote:
> Hello,
> first of all, thank you for great effort you made to develop
> Bioconductor packages.
>
> I have a question regarding annotations. I am working now with
> mouse4302.db database (Mouse Genome 430 2.0 Array I guess). First of
> all I wanted to find all the probes matching single gene. I thought if
> I will find all aliases to the gene name and then matching probes,
> it will be OK and then I can somehow sum expression values by gene (if
> there is some function or package to do so, please let me know).
>
> What I found was, that there is not only one official symbol to each
> gene. There should be two different official symbols for single gene.
> On the top of it, the relation which spans probes by aliases or gene
> names is not equivalence. I mean if there is relation for two probes
> "to be marked with the alias" or "to be marked by the gene" I would
> expect when for example when
>
> 1439356_at 1433512_at are marked with "Fli1" alias
>
> 1439356_at 1448189_a_at are marked with "3632430F08Rik" alias
>
> I would now expect that 1433512_at is marked by "3632430F08Rik" alias
> also but it is not.
>
> More exactly, I would expect that "to be marked with a symbol or
> alias" will be the same for all aliases but it is not and I dont know
> it is a mistake in annotation process or if there is some deeper
> biological explanation for that.
>
> I am attaching code sample for testing this. Please run
> geneInfo("Fli1") to test.
>
> With kind regards
> Vojtech Kulvait.
>
> geneInfo
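Following the advice to group by Entrez Gene ID rather than by symbol, one possible way to collapse probe-level expression to gene level is sketched below in base R. Everything here is illustrative: `probe2entrez` is a hand-written stand-in for what `mget(probes, mouse4302ENTREZID)` might return, the IDs `geneA` and `geneB` are placeholders rather than real Entrez Gene IDs, and the expression values are made up.

```r
# Hypothetical probe -> Entrez Gene ID map; in practice this could be built
# from mget(probes, mouse4302ENTREZID). The IDs are placeholders, not real.
probe2entrez <- c(
  "1439356_at"   = "geneA",
  "1448189_a_at" = "geneA",
  "1433512_at"   = "geneB"
)

# Made-up expression matrix: rows are probe sets, columns are samples.
expr <- matrix(
  c(8.0, 8.4,
    7.0, 7.6,
    5.0, 5.2),
  nrow = 3, byrow = TRUE,
  dimnames = list(names(probe2entrez), c("sample1", "sample2"))
)

# Drop probes without a gene ID, then group the remaining rows by gene.
ids   <- probe2entrez[rownames(expr)]
keep  <- !is.na(ids)
genes <- ids[keep]

# Per-gene sum across probe sets, then per-gene mean.
gene_sum  <- rowsum(expr[keep, , drop = FALSE], group = genes)
gene_mean <- gene_sum / as.vector(table(genes)[rownames(gene_sum)])

print(gene_mean)
```

Whether a sum, a mean, or some other summary is appropriate depends on the scale of the data and on the analysis; the sketch only shows the grouping step keyed on a unique identifier.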