biomaRt retrieving only NAs for Oryza sativa gene descriptions/annotations
jennareeg • 0

I am trying to retrieve Oryza sativa protein-coding genes together with their annotations. This code worked fine when I ran it months ago. Now it returns a data frame with the gene IDs, chromosome names, and start and end positions, but the description column contains only NAs. I need some help figuring out how to get the gene annotations again!

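# Install BiocManager if needed, then biomaRt from Bioconductor 3.8
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("biomaRt", version = "3.8")

library("biomaRt")

# Connect to the Ensembl Plants mart and select the Oryza sativa gene dataset
rice_mart <- useMart("plants_mart",
                     dataset = "osativa_eg_gene",
                     host = "plants.ensembl.org")

# Retrieve ID, location, and description for all protein-coding genes
rice_genes <- getBM(attributes = c("ensembl_gene_id",
                                   "chromosome_name",
                                   "start_position",
                                   "end_position",
                                   "description"),
                    filters = "biotype",
                    values = "protein_coding",
                    mart = rice_mart)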

sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] biomaRt_2.38.0 GenomicRanges_1.34.0 GenomeInfoDb_1.18.2 IRanges_2.16.0 S4Vectors_0.20.1
[6] BiocGenerics_0.28.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1 BiocManager_1.30.4 compiler_3.5.3 XVector_0.22.0
 [5] prettyunits_1.0.2 bitops_1.0-6 tools_3.5.3 progress_1.2.0
 [9] zlibbioc_1.28.0 digest_0.6.18 bit_1.1-14 RSQLite_2.1.1
[13] memoise_1.1.0 pkgconfig_2.0.2 rlang_0.3.4 DBI_1.0.0
[17] rstudioapi_0.10 curl_3.3 yaml_2.2.0 xfun_0.6
[21] GenomeInfoDbData_1.2.0 stringr_1.4.0 httr_1.4.0 knitr_1.22
[25] hms_0.4.2 bit64_0.9-7 Biobase_2.42.0 R6_2.4.0
[29] AnnotationDbi_1.44.0 XML_3.98-1.19 blob_1.1.1 magrittr_1.5
[33] assertthat_0.2.1 stringi_1.4.3 RCurl_1.95-4.12 crayon_1.3.4

Mike Smith ★ 6.5k
EMBL Heidelberg

It looks to me like the description is also blank if you use the web interface, e.g. Ensembl Plants BioMart.

If that's the case, then you'll need to contact Ensembl to find out why the underlying dataset has changed. It looks like the rice genome was updated in the last release, so maybe this is related.

Details on genome updates: ... - Rice (Oryza sativa): updated gene annotation to RAP-DB version 2018-11-26 and added stable_id mappings to previous annotation
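
Before contacting Ensembl, it may help to quantify how much of the description column is actually empty, so the report can say whether every gene or only some are affected. The following is a minimal sketch in base R. `description_coverage` is a hypothetical helper (not part of biomaRt), and the toy data frame uses placeholder values standing in for a real `getBM()` result.

```r
# Hypothetical helper, not part of biomaRt: count rows of a
# getBM()-style data frame whose description is missing.
description_coverage <- function(df, column = "description") {
  stopifnot(is.data.frame(df), column %in% names(df))
  values <- as.character(df[[column]])
  # Treat NA as well as empty or whitespace-only strings as missing
  missing <- is.na(values) | !nzchar(trimws(values))
  c(total   = length(values),
    missing = sum(missing),
    present = sum(!missing))
}

# Toy data standing in for the result of getBM(); all values are placeholders.
toy <- data.frame(
  ensembl_gene_id = c("gene_A", "gene_B", "gene_C"),
  description     = c(NA, "", "example description"),
  stringsAsFactors = FALSE
)

coverage <- description_coverage(toy)
print(coverage)
```

With the data frame from the question, the call would be `description_coverage(rice_genes)`. If `missing` equals `total`, the descriptions are absent for every gene, which would be consistent with a change in the underlying dataset rather than a problem with the query.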
