I would like to retrieve the Biological Process Ancestors of some GO IDs, but I can't get it to work using the select() method. The old mget() method still works, but AFAIK using this method is no longer recommended (because of the handling of multiple 'hits'). Am I doing something wrong, or does it indeed not work (and is this still intentional)?
Thanks!
Guido
> library(GO.db)
> # which annotation info can be retrieved?
> columns(GO.db)#mmm, only 4...?
[1] "DEFINITION" "GOID" "ONTOLOGY" "TERM"
>
> # Let's query for 2 of these 4.
>
> # some random GOIDs
> keys <- c("GO:0003677", "GO:0003899", "GO:0006351", "GO:0009507", "GO:0032549", "GO:0031047")
>
> # select does work, but is only able to retrieve few 'columns'?
> AnnotationDbi::select(GO.db, keys=keys, keytype="GOID", columns=c("TERM","ONTOLOGY") )
'select()' returned 1:1 mapping between keys and columns
GOID TERM ONTOLOGY
1 GO:0003677 DNA binding MF
2 GO:0003899 DNA-directed 5'-3' RNA polymerase activity MF
3 GO:0006351 transcription, DNA-templated BP
4 GO:0009507 chloroplast CC
5 GO:0032549 ribonucleoside binding MF
6 GO:0031047 gene silencing by RNA BP
>
> # ... but this doesn't work!
> AnnotationDbi::select(GO.db, keys=keys, keytype="GOID", columns=c("GOMFANCESTOR") )
Error in .testForValidCols(x, cols) :
Invalid columns: GOMFANCESTOR. Please use the columns method to see a listing of valid arguments.
>
> # Old way still does!
> mget(keys, GOMFANCESTOR, ifnotfound=NA)
$`GO:0003677`
[1] "GO:0003674" "GO:0003676" "GO:0005488" "GO:0097159" "GO:1901363"
[6] "all"
$`GO:0003899`
[1] "GO:0003674" "GO:0003824" "GO:0016740" "GO:0016772" "GO:0016779"
[6] "GO:0034062" "GO:0097747" "GO:0140098" "all"
$`GO:0006351`
[1] NA
$`GO:0009507`
[1] NA
$`GO:0032549`
[1] "GO:0001882" "GO:0003674" "GO:0005488" "GO:0036094" "GO:0097159"
[6] "GO:0097367" "GO:1901363" "all"
$`GO:0031047`
[1] NA
>
>
> sessionInfo()
R version 4.0.3 Patched (2020-12-21 r79668)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)
Matrix products: default
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] clusterProfiler_3.18.0 GO.db_3.12.1 AnnotationDbi_1.52.0
[4] IRanges_2.24.1 S4Vectors_0.28.1 Biobase_2.50.0
[7] BiocGenerics_0.36.0
Thanks Martin for your elaborate answer! I certainly need to study the details of the 'workflow' you provided more closely. Still, on the one hand it is good to know of a 'modern' way of querying annotation databases; on the other hand, it is reassuring to note that the old mget() function still provides correct results (at least for GOMFANCESTOR).
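For anyone who sticks with mget() but wants a long, select()-style table rather than a list, here is a minimal sketch using only base R. It assumes the list has the shape shown in the transcript above; the two entries are copied by hand from that output so the snippet runs without GO.db, and the object names anc and long and the column name ANCESTOR are just illustrative choices, not something GO.db defines.

```r
# Ancestor lists copied from the mget() output above (two keys only, for brevity).
anc <- list(
  "GO:0003677" = c("GO:0003674", "GO:0003676", "GO:0005488",
                   "GO:0097159", "GO:1901363", "all"),
  "GO:0006351" = NA_character_   # BP term, so no entry in the MF ancestor map
)

# stack() flattens a named list into a data frame with columns 'values' and 'ind'.
long <- stack(anc)

# Rename and reorder so the table reads like select() output:
# one row per (GOID, ancestor) pair, NA where nothing was found.
names(long) <- c("ANCESTOR", "GOID")
long$GOID <- as.character(long$GOID)
long <- long[, c("GOID", "ANCESTOR")]

print(long)
```

With a real result, the hand-written list would be replaced by the object returned from mget(keys, GOMFANCESTOR, ifnotfound=NA). Keys without a hit stay in the table as a single row with NA, which makes the 'multiple hits' and 'no hits' cases explicit.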