Ensembl gene datasets missing for some species
ziqin • 0
@638f8732
United States

Hi, I've recently been using the biomaRt package to retrieve Ensembl gene datasets. With listDatasets() I can get the names of the Ensembl datasets for different species, but it only returns 215 datasets for 215 species, and 63 of the species in my data are not among them. Is there another way to find Ensembl datasets for those species, either in R or outside R? I've listed a few of the missing species below. Thanks for reading.

scientific_name: Accipiter nisus, Anas zonorhyncha, Anser cygnoides, Apteryx haastii, Apteryx owenii, Apteryx rowi, Bubo bubo, Cairina moschata, Calidris pugnax, ...

biomaRt • 643 views
@james-w-macdonald-5106
United States

As an example,

> library(AnnotationHub)
> hub <- AnnotationHub()
  |======================================================================| 100%

snapshotDate(): 2022-04-25

> query(hub, c("Accipiter nisus","ensdb"))
AnnotationHub with 9 records
# snapshotDate(): 2022-04-25
# $dataprovider: Ensembl
# $species: Accipiter nisus
# $rdataclass: EnsDb
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["AH78714"]]' 

             title                                
  AH78714  | Ensembl 99 EnsDb for Accipiter nisus 
  AH79611  | Ensembl 100 EnsDb for Accipiter nisus
  AH83130  | Ensembl 101 EnsDb for Accipiter nisus
  AH89094  | Ensembl 102 EnsDb for Accipiter nisus
  AH89340  | Ensembl 103 EnsDb for Accipiter nisus
  AH95658  | Ensembl 104 EnsDb for Accipiter nisus
  AH97961  | Ensembl 105 EnsDb for Accipiter nisus
  AH100558 | Ensembl 106 EnsDb for Accipiter nisus
  AH104778 | Ensembl 107 EnsDb for Accipiter nisus

> Anisus.ensdb <- hub[["AH104778"]]
downloading 1 resources
retrieving 1 resource
  |======================================================================| 100%

loading from cache
require("ensembldb")
Warning message:
package 'GenomeInfoDb' was built under R version 4.2.1 
> Anisus.ensdb
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.7
|Creation time: Wed Jul 20 12:06:03 2022
|ensembl_version: 107
|ensembl_host: localhost
|Organism: Accipiter nisus
|taxonomy_id: 211598
|genome_build: Accipiter_nisus_ver1.0
|DBSCHEMAVERSION: 2.2
| No. of genes: 18087.
| No. of transcripts: 27904.
|Protein data available.
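
To repeat the lookup above for several species at once, something like the following sketch might help. The helper names `latest_ensdb_id()` and `fetch_latest_ensdb()` are made up for illustration and are not part of AnnotationHub. The sketch assumes the record titles follow the "Ensembl NN EnsDb for ..." pattern shown in the query output, and that a species with no records gives an empty query result.

```r
# Pick the record with the highest Ensembl release, given AnnotationHub
# record IDs and their titles. Pure helper: works on plain character
# vectors, so it can be checked without network access.
latest_ensdb_id <- function(ids, titles) {
  # Assumes titles look like "Ensembl 107 EnsDb for Accipiter nisus"
  versions <- as.integer(sub("^Ensembl ([0-9]+) EnsDb.*$", "\\1", titles))
  ids[which.max(versions)]
}

# Hypothetical wrapper: query the hub for one species and return the
# newest EnsDb, or NULL if the hub has no EnsDb for that species.
fetch_latest_ensdb <- function(hub, species) {
  res <- AnnotationHub::query(hub, c(species, "EnsDb"))
  if (length(res) == 0) return(NULL)
  hub[[latest_ensdb_id(names(res), res$title)]]
}

# Usage (needs network access plus AnnotationHub and ensembldb installed):
# library(AnnotationHub)
# library(ensembldb)
# hub <- AnnotationHub()
# species <- c("Accipiter nisus", "Anas zonorhyncha", "Anser cygnoides")
# dbs <- lapply(setNames(species, species), fetch_latest_ensdb, hub = hub)
# gene_tab <- genes(dbs[["Accipiter nisus"]], return.type = "data.frame")
# head(gene_tab[, c("gene_id", "gene_name", "gene_biotype", "seq_name")])
```

Any species that comes back as NULL has no EnsDb in the hub snapshot you are using, so it would need a different source.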

Mostly what will be available is positional data such as this, rather than functional annotation such as mappings between a gene ID and a GO term. If you want that sort of data, you might look at makeOrgPackageFromNCBI() from the AnnotationForge package.
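
A rough sketch of what that call could look like is below. The wrapper name `build_org_package()`, the version string, and the maintainer/author values are placeholders I made up; the taxonomy ID 211598 is the one reported in the EnsDb output above. Note that this downloads large files from NCBI and can take a long time, and I have not checked how much annotation NCBI actually holds for this species.

```r
# Hypothetical wrapper around AnnotationForge::makeOrgPackageFromNCBI().
# All default values here are placeholders; adjust them before use.
build_org_package <- function(tax_id, genus, species,
                              out_dir = tempdir(),
                              who = "Your Name <you@example.org>") {
  AnnotationForge::makeOrgPackageFromNCBI(
    version    = "0.1",     # version number for the generated package
    maintainer = who,
    author     = who,
    outputDir  = out_dir,   # where the package source directory is written
    tax_id     = tax_id,    # NCBI taxonomy ID, as a string
    genus      = genus,
    species    = species
  )
}

# Usage (needs network access and AnnotationForge installed):
# build_org_package("211598", "Accipiter", "nisus")
```

The result is a package source directory under `out_dir`, which you can then install with `install.packages(<path>, repos = NULL, type = "source")`.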
