I'm trying to match IDs from a GEOquery data set to annotation information within the package:
hugene10stprobeset.db
When using the select function it seems the probe ids are not matching correctly.
To test, I ran:
> ids <- head(keys(hugene10stprobeset.db, keytype="PROBEID"))
> select(hugene10stprobeset.db, keys=ids, cols=c("SYMBOL","UNIGENE"),keytype="PROBEID")
[1] PROBEID SYMBOL UNIGENE
<0 rows> (or 0-length row.names)
I get 0 matches even though I took the key names from the db itself.
When I do this for an alternate db it works:
> ids <- head(keys(hgu95av2.db, keytype="PROBEID"))
> select(hgu95av2.db, keys=ids, cols=c("SYMBOL","UNIGENE"), keytype="PROBEID")
    PROBEID  SYMBOL   UNIGENE
1   1000_at   MAPK3    Hs.861
2   1001_at    TIE1  Hs.78824
3 1002_f_at CYP2C19 Hs.282409
4 1003_s_at   CXCR5 Hs.113916
5   1004_at   CXCR5 Hs.113916
6   1005_at   DUSP1 Hs.171695
Based on this I think the select function within hugene10stprobeset.db is behaving improperly.
> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] annotate_1.32.3             hgu95av2.db_2.6.3
 [3] hugene10stprobeset.db_8.0.1 hgu133a.db_2.6.3
 [5] org.Hs.eg.db_2.6.4          RSQLite_0.11.4
 [7] DBI_0.2-5                   AnnotationDbi_1.16.19
 [9] BiocInstaller_1.2.1         Biobase_2.14.0

loaded via a namespace (and not attached):
[1] IRanges_1.12.6 tools_2.14.0 xtable_1.7-1

I thought the select() contract is to return at least 1:1 mappings, sometimes 1:many, so the first query should return a 6x3 data.frame? Maybe the OP has a borked package. I get this:
> ids <- head(keys(hugene10stprobeset.db, keytype="PROBEID"))
> select(hugene10stprobeset.db, keys=ids, cols=c("SYMBOL","UNIGENE"),keytype="PROBEID")
  PROBEID SYMBOL UNIGENE
1 7892501   <NA>    <NA>
2 7892502   <NA>    <NA>
3 7892503   <NA>    <NA>
4 7892504   <NA>    <NA>
5 7892505   <NA>    <NA>
6 7892506   <NA>    <NA>
Warning message:
In .colsArgumentWarning() :
  The 'cols' argument has been deprecated and replaced by 'columns' for versions of Bioc that are higher than 2.13. Please use the 'columns' argument anywhere that you previously used 'cols'
But then I am not using R-2.14.0 either.
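For anyone on a newer Bioconductor, the deprecation warning above suggests writing the same query with 'columns' in place of 'cols'. A minimal sketch, assuming hugene10stprobeset.db is installed and that your version of the package still exposes the SYMBOL and UNIGENE columns (check with columns() first; I have not run this on the OP's R-2.14.0 setup):

```r
# Assumes a Bioconductor release newer than 2.13, where select() takes 'columns'.
library(hugene10stprobeset.db)

# List the columns this package version actually offers before querying.
columns(hugene10stprobeset.db)

# Same test as above: take the first few probe IDs straight from the package.
ids <- head(keys(hugene10stprobeset.db, keytype = "PROBEID"))

# Same query, with 'columns' replacing the deprecated 'cols' argument.
select(hugene10stprobeset.db, keys = ids,
       columns = c("SYMBOL", "UNIGENE"), keytype = "PROBEID")
```

Going by the output above, I would expect one row per probe ID, with NA where a probe has no annotation, rather than a 0-row data.frame.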