Proper use of annotation package
@kamila-naxerova-4164
Last seen 9.6 years ago
Hi all,

I have a little problem using an annotation package that I know how to work around, but I am wondering how to do it more elegantly and efficiently.

I am analyzing a bunch of Mouse Gene 2.0 ST arrays. I built my own annotation package (with much help from all of you!). I am using Limma and want to look up annotation for diff exp genes provided by topTable(). So it's really a standard situation. On the Bioconductor website, this sequence of commands is suggested (http://www.bioconductor.org/help/workflows/annotation-data/)

    tbl <- topTable(efit, coef=2)
    ids <- tbl[["ID"]]
    entrez <- hgu95av2ENTREZID[ids]

Looks beautiful! But when I try to do the same thing, I get:

    tbl <- topTable(fit2all, number=100)
    ids <- tbl[["ID"]]
    mogene20sttranscriptclusterACCNUM[ids]

    Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
      value for "17549282" not found

This error is evidently generated because some of the ids don't map to any accession numbers. I can work around this by filtering my ids first, but am I doing it wrong? Of course lots of probe ids on the array are not going to map to any accession numbers or symbols or names -- why can't they just come back with NA instead of an error message and abortion of the whole process?

Thanks!
Kamila
Annotation probe • 1.0k views
@james-w-macdonald-5106
Last seen 10 hours ago
United States
Hi Kamila,

You could use the select() method instead:

    select(mogene20sttranscriptcluster.db, ids, "ACCNUM")

which will return NA appropriately.

Best,
Jim

On Mar 18, 2013 2:48 PM, "Naxerova, Kamila" <naxerova@fas.harvard.edu> wrote:

> mogene20sttranscriptclusterACCNUM[ids]
>
> Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
>   value for "17549282" not found
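The difference between the two lookup styles can be pictured without the annotation package at all. The following is a minimal base-R sketch, not AnnotationDbi code: `annot` and `lookup_with_na()` are hypothetical stand-ins, and the accession values are made up. Only "17549282" is taken from the error message in this thread.

```r
# Toy stand-in for the annotation mapping. The accessions are made up;
# "17549282" is deliberately absent, as in the error reported above.
annot <- c("17210850" = "NM_000001", "17210855" = "NM_000002")

ids <- c("17210850", "17549282", "17210855")

# Strict lookup: fails as soon as one key is missing, which mimics
# the behaviour of subsetting the mapping object with [ids].
strict <- tryCatch(
  {
    missing_ids <- setdiff(ids, names(annot))
    if (length(missing_ids) > 0)
      stop("value for \"", missing_ids[1], "\" not found")
    annot[ids]
  },
  error = function(e) conditionMessage(e)
)

# Lenient lookup: unmatched keys come back as NA instead of an error,
# similar in spirit to the data frame that select() returns.
lookup_with_na <- function(keys, mapping) {
  data.frame(PROBEID = keys,
             ACCNUM = unname(mapping[match(keys, names(mapping))]),
             stringsAsFactors = FALSE)
}

res <- lookup_with_na(ids, annot)
print(strict)
print(res)
```

Because the lenient result keeps one row per input ID in the toy case, it can be bound back onto the topTable() output by ID. A real annotation package may return several rows for one ID when the mapping is one-to-many, so that step would need checking on real data.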
@kamila-naxerova-4164
Last seen 9.6 years ago
Thanks for your prompt reply Jim!

Actually I am afraid there is some deeper lack of knowledge on my part here... Some of the ids that topTable() returns simply don't exist in the annotation package, period. They also don't exist when I search for them in the NetAffx Analysis center. What are these? What is, e.g., "17549282"?

After reading my cel files and normalizing them, my eset looks like this:

    > dim(eset)
    Features  Samples
       41345       12

The Affy Transcript cluster file only has about 39400 entries. What are these 2000 ids that are not overlapping? And some of them clearly are differentially expressed...

Kamila
Hi Kamila,

The probeset you note is found by netaffx, and it is a 'rescue' probeset. This has a special use, and will not have an annotation associated with it.

I generally remove all non-main probe sets after the eBayes() step. You can use the getMainProbes() function in the affycoretools package to accomplish that task.

Best,
Jim
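As a package-free illustration of the filtering idea only, here is a small base-R sketch. It does not call getMainProbes(). The table values are made up, and `main_ids` is a hypothetical vector standing in for whatever list of main-design probeset IDs is available for the array.

```r
# Toy results table shaped like topTable() output (values are made up).
tbl <- data.frame(ID = c("17210850", "17549282", "17210855"),
                  logFC = c(1.8, -2.1, 0.9),
                  stringsAsFactors = FALSE)

# Hypothetical vector of probeset IDs that belong to the main design.
# In practice this would come from the annotation or platform package.
main_ids <- c("17210850", "17210855")

# IDs present in the results but absent from the main set,
# e.g. control or 'rescue' probesets.
not_main <- setdiff(tbl$ID, main_ids)

# Keep only the rows for main probesets.
tbl_main <- tbl[tbl$ID %in% main_ids, , drop = FALSE]

print(not_main)
print(tbl_main)
```

Listing `not_main` before dropping anything is a cheap way to see how many IDs fall outside the annotation, which is the kind of gap behind the 41345 versus roughly 39400 counts mentioned above.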