Question: GenomicFeatures makeTxDbFromBiomart fails with "unknown species" error
21 months ago by
kaur.alasoo30
University of Tartu, Tartu, Estonia
kaur.alasoo30 wrote:

I tried to construct a TxDb object from the latest version of Ensembl (v87):

txdb87 = makeTxDbFromBiomart( biomart = "ENSEMBL_MART_ENSEMBL", 
dataset = "hsapiens_gene_ensembl", host="dec2016.archive.ensembl.org")

But I got the following error:

Download and preprocess the 'transcripts' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Download and preprocess the 'splicings' data frame ... OK
Download and preprocess the 'genes' data frame ... OK
Prepare the 'metadata' data frame ... Error in FUN(X[[i]], ...) : 
  1 unknown species: ‘Human genes’ Please use 'available.species' to see viable species names or tax Ids

Session info:

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.2 (Sierra)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BiocInstaller_1.24.0   dplyr_0.5.0            biomaRt_2.30.0         GenomicFeatures_1.26.2 AnnotationDbi_1.36.2  
 [6] Biobase_2.34.0         GenomicRanges_1.26.2   GenomeInfoDb_1.10.3    IRanges_2.8.1          S4Vectors_0.12.1      
[11] BiocGenerics_0.20.0   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.9                magrittr_1.5               XVector_0.14.0             zlibbioc_1.20.0            GenomicAlignments_1.10.0  
 [6] BiocParallel_1.8.1         R6_2.2.0                   tools_3.3.1                SummarizedExperiment_1.4.0 DBI_0.5-1                 
[11] lazyeval_0.2.0             assertthat_0.1             tibble_1.2                 rtracklayer_1.34.1         bitops_1.0-6              
[16] RCurl_1.95-4.8             RSQLite_1.1-2              Biostrings_2.42.1          Rsamtools_1.26.1           XML_3.98-1.5  

 

Modified 21 months ago by James W. MacDonald48k • written 21 months ago by kaur.alasoo30
21 months ago by
United States
James W. MacDonald48k wrote:

Works for me:

> txdb87 = makeTxDbFromBiomart( biomart = "ENSEMBL_MART_ENSEMBL",
+ dataset = "hsapiens_gene_ensembl", host="dec2016.archive.ensembl.org")
Download and preprocess the 'transcripts' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Download and preprocess the 'splicings' data frame ... OK
Download and preprocess the 'genes' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK

> txdb87
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: BioMart
# Organism: Homo sapiens
# Taxonomy ID: 9606
# Resource URL: www.ensembl.org:80
# BioMart database: ENSEMBL_MART_ENSEMBL
# BioMart database version: Ensembl Genes 87
# BioMart dataset: hsapiens_gene_ensembl
# BioMart dataset description: hsapiens_gene_ensembl
# BioMart dataset version: GRCh38.p7
# Full dataset: yes
# miRBase build ID: NA
# transcript_nrow: 215929
# exon_nrow: 737982
# cds_nrow: 295719
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2017-02-11 13:08:58 -0800 (Sat, 11 Feb 2017)
# GenomicFeatures version at creation time: 1.26.2
# RSQLite version at creation time: 1.1-2
# DBSCHEMAVERSION: 1.1

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] GenomicFeatures_1.26.2 AnnotationDbi_1.36.1   Biobase_2.34.0        
[4] GenomicRanges_1.26.2   GenomeInfoDb_1.10.2    IRanges_2.8.1         
[7] S4Vectors_0.12.1       BiocGenerics_0.20.0    biomaRt_2.30.0        

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.9                XVector_0.14.0            
 [3] zlibbioc_1.20.0            GenomicAlignments_1.10.0  
 [5] BiocParallel_1.8.1         lattice_0.20-34           
 [7] tools_3.3.1                grid_3.3.1                
 [9] SummarizedExperiment_1.4.0 DBI_0.5-1                 
[11] digest_0.6.12              Matrix_1.2-8              
[13] rtracklayer_1.34.1         bitops_1.0-6              
[15] RCurl_1.95-4.8             memoise_1.0.0             
[17] RSQLite_1.1-2              compiler_3.3.1            
[19] Biostrings_2.42.1          Rsamtools_1.26.1          
[21] XML_3.98-1.5              
>

Maybe try again?

 

Written 21 months ago by James W. MacDonald48k

Yes, you are right. Seems to be working now.

Thanks!

Written 21 months ago by kaur.alasoo30

I'm getting the same error.

Weirdly, it works if I use an archived host (host = "jul2016.archive.ensembl.org"), but not with the current "ensembl.org" host.
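
One way to script the workaround described here is to try several hosts in turn and keep the first one that works. The base-R sketch below is only an illustration: `first_working_host` is a hypothetical helper introduced for this example, not part of GenomicFeatures or biomaRt, and the actual makeTxDbFromBiomart() call is left in comments because it needs the Bioconductor packages and network access.

```r
# Try each host in turn and return the result of the first call that succeeds.
# `build` is any function that takes a host name; errors are caught, reported
# with message(), and the next host is tried.
first_working_host <- function(hosts, build) {
  for (host in hosts) {
    result <- tryCatch(build(host), error = function(e) {
      message("host '", host, "' failed: ", conditionMessage(e))
      NULL
    })
    if (!is.null(result)) {
      return(list(host = host, value = result))
    }
  }
  stop("no host produced a result")
}

# Intended use (assumes GenomicFeatures is installed; not run here):
# library(GenomicFeatures)
# res <- first_working_host(
#   c("ensembl.org", "jul2016.archive.ensembl.org"),
#   function(h) makeTxDbFromBiomart(biomart = "ENSEMBL_MART_ENSEMBL",
#                                   dataset = "cfamiliaris_gene_ensembl",
#                                   host = h))
# res$host   # the host that worked
```

Passing the builder as a function keeps the retry logic independent of the Bioconductor call, so it can be checked with a stand-in function. The failing call against "ensembl.org" and the session info follow.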

> CanFam.txdb <- makeTxDbFromBiomart(biomart = "ENSEMBL_MART_ENSEMBL",
+                                    dataset = "cfamiliaris_gene_ensembl",
+                                    host = "ensembl.org")
Download and preprocess the 'transcripts' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Download and preprocess the 'splicings' data frame ... OK
Download and preprocess the 'genes' data frame ... OK
Prepare the 'metadata' data frame ... Error in FUN(X[[i]], ...) : 
  1 unknown species: ‘Dog genes’ Please use 'available.species' to see viable species names or tax Ids
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.1 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rsubread_1.22.3        VennDiagram_1.6.17     futile.logger_1.4.3    GenomicFeatures_1.24.5 AnnotationDbi_1.34.4   Biobase_2.32.0         biomaRt_2.28.0        
 [8] gridExtra_2.2.1        tidyr_0.6.1            knitr_1.15.1           DT_0.2                 RColorBrewer_1.1-2     ggplot2_2.2.1          BiocInstaller_1.22.3  
[15] GenomicRanges_1.24.3   GenomeInfoDb_1.8.7     IRanges_2.6.1          S4Vectors_0.10.3       BiocGenerics_0.18.0   

loaded via a namespace (and not attached):
 [1] SummarizedExperiment_1.2.3 colorspace_1.3-2           htmltools_0.3.5            rtracklayer_1.32.2         yaml_2.1.14                XML_3.98-1.5              
 [7] DBI_0.5-1                  BiocParallel_1.6.6         lambda.r_1.1.9             plyr_1.8.4                 stringr_1.1.0              zlibbioc_1.18.0           
[13] Biostrings_2.40.2          munsell_0.4.3              gtable_0.2.0               htmlwidgets_0.8            evaluate_0.10              memoise_1.0.0             
[19] labeling_0.3               highr_0.6                  Rcpp_0.12.9                scales_0.4.1               backports_1.0.5            jsonlite_1.2              
[25] XVector_0.12.1             Rsamtools_1.24.0           digest_0.6.12              stringi_1.1.2              dplyr_0.5.0                rprojroot_1.2             
[31] tools_3.3.1                bitops_1.0-6               magrittr_1.5               lazyeval_0.2.0             RCurl_1.95-4.8             tibble_1.2                
[37] RSQLite_1.1-2              futile.options_1.0.0       rsconnect_0.7              assertthat_0.1             rmarkdown_1.3              R6_2.2.0                  
[43] GenomicAlignments_1.8.4   
Modified 21 months ago • written 21 months ago by chitsazanalex10

Hi,

Maybe the problem is that you're not using the latest released version of Bioconductor (which is 3.4). Fixes were recently applied to makeTxDbFromBiomart() in BioC 3.4 (and in BioC devel) to work around issues introduced by changes on the Ensembl Mart side.

Try to load the BiocInstaller package. You should see something like this:

> library(BiocInstaller)
Bioconductor version 3.3 (BiocInstaller 1.22.3), ?biocLite for help
A newer version of Bioconductor is available for this version of R,
  ?BiocUpgrade for help

I strongly suggest that you upgrade your installation to use BioC 3.4 so you get these fixes.
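
As a rough illustration of this check, the base-R sketch below compares a Bioconductor release string with the 3.4 release named in this post. `required_bioc` and `needs_upgrade` are hypothetical names introduced for the example, not part of BiocInstaller. The commented lines show the upgrade route that, as far as can be told, applied to BiocInstaller-based installations on R 3.3; treat them as an assumption and consult ?BiocUpgrade before running them.

```r
# Bioconductor release that carries the makeTxDbFromBiomart() fixes,
# according to the post above.
required_bioc <- "3.4"

# TRUE when the installed release is older than the required one.
# package_version() compares versions numerically, component by component.
needs_upgrade <- function(installed, required = required_bioc) {
  package_version(installed) < package_version(required)
}

# The release reported in the BiocInstaller start-up message above.
needs_upgrade("3.3")

# Assumed upgrade steps for R 3.3 (need network access; not run here):
# source("https://bioconductor.org/biocLite.R")
# biocLite("BiocUpgrade")
# packageVersion("GenomicFeatures")   # confirm the package version afterwards
```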

Cheers,

H.

Modified 21 months ago • written 21 months ago by Hervé Pagès ♦♦ 13k

This solved the problem, thank you.

Written 21 months ago by chitsazanalex10

This error seems to be cropping up in GenomicFeatures:::.prepareBiomartMetadata, which is an internal function that isn't really intended for people to call. Regardless, we can call it directly to see whether it reproduces the error you see:

> mart <- useMart("ENSEMBL_MART_ENSEMBL", "cfamiliaris_gene_ensembl")
> GenomicFeatures:::.prepareBiomartMetadata(mart, TRUE, "ensembl.org", "80", "9615", "5")
Prepare the 'metadata' data frame ... OK
                          name                    value
1                  Data source                  BioMart
2                     Organism         Canis familiaris
3                  Taxonomy ID                     9615
4                 Resource URL       www.ensembl.org:80
5             BioMart database     ENSEMBL_MART_ENSEMBL
6     BioMart database version         Ensembl Genes 87
7              BioMart dataset cfamiliaris_gene_ensembl
8  BioMart dataset description cfamiliaris_gene_ensembl
9      BioMart dataset version                CanFam3.1
10                Full dataset                      yes
11            miRBase build ID                        5

And as before, I can't reproduce the error. What happens if you try to do this?

Written 21 months ago by James W. MacDonald48k