While you can use as.list to get the values, it's really not the way to go. You end up relying on old code that doesn't really do what you might think it does. Plus, why would you want a list in the first place?
As an example:
> z <- as.list(hugene10sttranscriptclusterSYMBOL[keys(hugene10sttranscriptcluster.db)])
> table(sapply(z, is.na))
FALSE TRUE
19985 13312
So using as.list returns 13312 probesets that appear not to have a HUGO symbol assigned. But that's not quite right:
> zz <- mapIds(hugene10sttranscriptcluster.db, keys(hugene10sttranscriptcluster.db), "SYMBOL", "PROBEID")
'select()' returned 1:many mapping between keys and columns
> sum(is.na(zz))
[1] 10992
There are actually a bit over 2000 probeids that do have a HUGO symbol, for which as.list is returning NA. This is because those probeids have multiple symbols!
> ind <- sapply(z, is.na) & !is.na(zz)
> sum(ind)
[1] 2320
> head(select(hugene10sttranscriptcluster.db, names(zz)[ind], "SYMBOL"))
'select()' returned 1:many mapping between keys and columns
  PROBEID      SYMBOL
1 7896740       OR4F4
2 7896740      OR4F17
3 7896740       OR4F5
4 7896742 LINC00266-1
5 7896742      PCMTD2
6 7896742   LOC728323
The old style functions return NA for any probeset ID that maps to more than one HUGO symbol (or whatever else you are mapping to), whereas mapIds and select do not. So it's a better idea to use the more current query methods.
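One thing to keep in mind with mapIds is that, by default (multiVals = "first"), it keeps only the first symbol for a probeset that maps to several. If you want all of them, the multiVals argument controls that. Here is a minimal sketch, assuming the hugene10sttranscriptcluster.db package is installed; the two probeset IDs are taken from the select output above, and the object names (probes, first_only, all_syms, collapsed) are just for illustration:

```r
library(AnnotationDbi)
library(hugene10sttranscriptcluster.db)

# Two probeset IDs from the select() output above (illustrative choice)
probes <- c("7896740", "7896742")

# Default behaviour: one symbol per probeset (the first match)
first_only <- mapIds(hugene10sttranscriptcluster.db, keys = probes,
                     column = "SYMBOL", keytype = "PROBEID")

# Keep every match: returns a named list with one character vector per probeset
all_syms <- mapIds(hugene10sttranscriptcluster.db, keys = probes,
                   column = "SYMBOL", keytype = "PROBEID",
                   multiVals = "list")

# Collapse to one string per probeset, e.g. for a single annotation column.
# Note that an unmapped probeset (NA) would become the string "NA" here.
collapsed <- vapply(all_syms, paste, character(1), collapse = ";")
```

Which symbols come back will depend on the version of the annotation package you have installed, so the exact results may not match the output shown above.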
In addition, assuming your data are in an ExpressionSet, you can use annotateEset from my affycoretools package to do this in one line. Say your ExpressionSet is called 'eset':
library(affycoretools)
eset <- annotateEset(eset, hugene10sttranscriptcluster.db)
And then model fitting tools like limma will return fully annotated topTable results.
Thank you :)