How to find Ensembl genes belonging to a GO category?
Guido Hooiveld ★ 4.0k
Wageningen University, Wageningen, the …


I have a list of Gene Ontology IDs, and I would like to extract the genes (Ensembl IDs) that belong to these GO terms. To do this I would like to use the biomaRt package. However, I cannot get it to work; I get an error indicating that an invalid filter is used. This also happens when running an example from the vignette. Apparently neither 'go_id' nor 'go' is a valid name for the GO filter anymore? I would appreciate some assistance with this.

Thanks, Guido


ensembl = useMart("ensembl",dataset="hsapiens_gene_ensembl")

# example from vignette:

getBM(attributes = "hgnc_symbol",
      filters = c("go_id", "chromosome_name"),
      values = list(go, chrom), mart = ensembl)

Error in getBM(attributes = "hgnc_symbol", filters = c("go_id", "chromosome_name"),  :
  Invalid filters(s): go_id
Please use the function 'listFilters' to get valid filter names

# listFilters shows this Gene Ontology identifier filter, but using it doesn't work either:

> listFilters(ensembl)[82,]
   name                description
82   go GO ID(s) [e.g. GO:0000002]



> sessionInfo()
R version 3.5.1 Patched (2018-08-13 r75130)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomaRt_2.36.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18         AnnotationDbi_1.42.1 magrittr_1.5         BiocGenerics_0.26.0
 [5] hms_0.4.2            progress_1.2.0       IRanges_2.14.12      bit_1.1-14          
 [9] R6_2.2.2             rlang_0.2.2          httr_1.3.1           stringr_1.3.1       
[13] blob_1.1.1           tools_3.5.1          parallel_3.5.1       Biobase_2.40.0      
[17] DBI_1.0.0            bit64_0.9-7          digest_0.6.17        assertthat_0.2.0    
[21] crayon_1.3.4         S4Vectors_0.18.3     bitops_1.0-6         RCurl_1.95-4.11     
[25] memoise_1.1.0        RSQLite_2.1.1        stringi_1.2.4        compiler_3.5.1      
[29] prettyunits_1.0.2    stats4_3.5.1         XML_3.98-1.16        pkgconfig_2.0.2     


> getBM(c("go_id","ensembl_gene_id"), "go", go, mart)
          go_id ensembl_gene_id
1    GO:0003723 ENSG00000206557
2    GO:0005737 ENSG00000206557
3    GO:0008270 ENSG00000206557
4    GO:0046872 ENSG00000206557
5    GO:0016740 ENSG00000206557


But then what you say you want to do (get Ensembl gene IDs for GO terms) and what your code is trying to do (get HUGO gene symbols based on a weird combination of GO ID and chromosome?) are not the same thing. However, that second query also works once the filter is named 'go':

> getBM(c("go_id","ensembl_gene_id","hgnc_symbol"), c("go", "chromosome_name"), list(go, chrom), mart)
         go_id ensembl_gene_id hgnc_symbol
1   GO:0005634 ENSG00000108443     RPS6KB1
2   GO:0005737 ENSG00000108443     RPS6KB1
3   GO:0016020 ENSG00000108443     RPS6KB1
4   GO:0000166 ENSG00000108443     RPS6KB1
5   GO:0004672 ENSG00000108443     RPS6KB1
6   GO:0004674 ENSG00000108443     RPS6KB1
7   GO:0005524 ENSG00000108443     RPS6KB1
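
For the original goal (Ensembl gene IDs per GO term), here is a minimal sketch. It assumes the filter is called 'go', as in the calls above. `fetch_go_genes` and `genes_by_go` are hypothetical helper names, and the small data frame is only a toy stand-in shaped like getBM output, built from a few rows of the outputs above. Since those outputs list several different GO IDs for a single gene, the sketch subsets the rows back to the requested IDs before grouping.

```r
# Hypothetical helper: query Ensembl for genes annotated with the given GO IDs.
# Needs the biomaRt package and network access; it is not called in this sketch.
fetch_go_genes <- function(go, mart) {
  biomaRt::getBM(attributes = c("go_id", "ensembl_gene_id"),
                 filters = "go", values = go, mart = mart)
}

# Hypothetical helper: keep only rows for the requested GO IDs, then
# return a named list of unique gene IDs, one element per GO ID found.
genes_by_go <- function(res, go) {
  res <- res[res$go_id %in% go, , drop = FALSE]
  lapply(split(res$ensembl_gene_id, res$go_id), unique)
}

# Toy input shaped like getBM output (rows copied from the outputs above).
res <- data.frame(
  go_id = c("GO:0003723", "GO:0005737", "GO:0005634", "GO:0005737"),
  ensembl_gene_id = c("ENSG00000206557", "ENSG00000206557",
                      "ENSG00000108443", "ENSG00000108443"),
  stringsAsFactors = FALSE
)

by_go <- genes_by_go(res, go = "GO:0005737")
print(by_go)
```

With a live connection, the same grouping would presumably be applied as `genes_by_go(fetch_go_genes(go, ensembl), go)`.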

In fairness to the OP, the example does state that it comes from the biomaRt vignette, and it fails in the same manner. I've updated the devel vignette with the correct filter name now.
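
For anyone hitting the same "Invalid filters(s)" error, a rough sketch of checking filter names before calling getBM is given here. `invalid_filters` is a hypothetical helper, and `filter_table` is a toy stand-in for the data frame returned by `listFilters(ensembl)`: the 'go' row is taken from the question, while the 'chromosome_name' description is an assumed placeholder.

```r
# Hypothetical helper: return the requested filter names that are not
# present in a table shaped like the output of listFilters().
invalid_filters <- function(wanted, filter_table) {
  setdiff(wanted, filter_table$name)
}

# Toy stand-in for listFilters(ensembl); in a real session use that call instead.
filter_table <- data.frame(
  name = c("chromosome_name", "go"),
  description = c("Chromosome/scaffold name", "GO ID(s) [e.g. GO:0000002]"),
  stringsAsFactors = FALSE
)

# Names used in the failing vignette example.
bad <- invalid_filters(c("go_id", "chromosome_name"), filter_table)
print(bad)

# Look for likely replacements among filter names starting with "go".
hits <- filter_table[grepl("^go", filter_table$name), ]
print(hits$name)
```

Running the check first makes it obvious which name in a multi-filter query is the stale one.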


Thanks for the pointers; got it working.

