I'm trying to match IDs from a GEOquery data set to annotation information in the hugene10stprobeset.db package.
When using the select function, it seems the probe IDs are not matching correctly.
To test, I ran:
> ids <- head(keys(hugene10stprobeset.db, keytype="PROBEID"))
> select(hugene10stprobeset.db, keys=ids, cols=c("SYMBOL","UNIGENE"), keytype="PROBEID")
[1] PROBEID SYMBOL  UNIGENE
<0 rows> (or 0-length row.names)
I get 0 matches even though I took the key names from the db itself.
When I do this for an alternate db it works:
> ids <- head(keys(hgu95av2.db, keytype="PROBEID"))
> select(hgu95av2.db, keys=ids, cols=c("SYMBOL","UNIGENE"), keytype="PROBEID")
    PROBEID  SYMBOL   UNIGENE
1   1000_at   MAPK3    Hs.861
2   1001_at    TIE1  Hs.78824
3 1002_f_at CYP2C19 Hs.282409
4 1003_s_at   CXCR5 Hs.113916
5   1004_at   CXCR5 Hs.113916
6   1005_at   DUSP1 Hs.171695
Based on this I think the select function within hugene10stprobeset.db is behaving improperly.
> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] annotate_1.32.3 hgu95av2.db_2.6.3
[3] hugene10stprobeset.db_8.0.1 hgu133a.db_2.6.3
[5] org.Hs.eg.db_2.6.4 RSQLite_0.11.4
[7] DBI_0.2-5 AnnotationDbi_1.16.19
[9] BiocInstaller_1.2.1 Biobase_2.14.0
loaded via a namespace (and not attached):
[1] IRanges_1.12.6 tools_2.14.0 xtable_1.7-1
I thought the select() contract is to return at least 1:1 mappings, sometimes 1:many, so the first query should return a 6x3 data.frame?
Maybe the OP has a borked package. I get this:
But then I am not using R-2.14.0 either.
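One way to narrow this down is to check whether the six probe IDs have any SYMBOL mapping at all before blaming select(). The following is only a rough sketch: mapped_status is a hypothetical helper made up for illustration, and it assumes the package exports a hugene10stprobesetSYMBOL Bimap the way the other chip annotation packages do. It does not by itself establish what select() should return.

```r
# Hypothetical helper (not part of AnnotationDbi): flags which of the
# requested keys appear among the keys that have at least one mapping.
mapped_status <- function(ids, mapped) {
  data.frame(PROBEID = ids,
             has_mapping = ids %in% mapped,
             stringsAsFactors = FALSE)
}

# Only runs when the annotation package is installed.
if (suppressWarnings(require("hugene10stprobeset.db",
                             character.only = TRUE, quietly = TRUE))) {
  ids <- head(keys(hugene10stprobeset.db, keytype = "PROBEID"))
  # Assumption: hugene10stprobesetSYMBOL is the package's probe-to-symbol Bimap;
  # mappedkeys() returns the probe IDs that have a SYMBOL entry in it.
  print(mapped_status(ids, mappedkeys(hugene10stprobesetSYMBOL)))
}
```

If has_mapping is FALSE for all six IDs, the empty result would say more about those particular probesets than about the package install, and it would be worth repeating the select() call with IDs taken from mappedkeys() instead.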