Missing Ensembl gene datasets for some species
ziqin • 0

Hi, I've recently been using the biomaRt package to get Ensembl gene datasets. With listDatasets() I can get the names of the Ensembl datasets for different species, but it returns only 215 datasets (215 species) in R, and 63 of the species in my own data are not among them. Is there another way, in R or outside R, to find Ensembl datasets for those species? I've attached a few of the missing species below. Thanks for reading.

scientific_name: Accipiter nisus, Anas zonorhyncha, Anser cygnoides, Apteryx haastii, Apteryx owenii, Apteryx rowi, Bubo bubo, Cairina moschata, Calidris pugnax, ...

biomaRt • 593 views

As an example, you can search AnnotationHub for an EnsDb for one of those species:

> library(AnnotationHub)
> hub <- AnnotationHub()
  |======================================================================| 100%

snapshotDate(): 2022-04-25

> query(hub, c("Accipiter nisus","ensdb"))
AnnotationHub with 9 records
# snapshotDate(): 2022-04-25
# $dataprovider: Ensembl
# $species: Accipiter nisus
# $rdataclass: EnsDb
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["AH78714"]]' 

  AH78714  | Ensembl 99 EnsDb for Accipiter nisus 
  AH79611  | Ensembl 100 EnsDb for Accipiter nisus
  AH83130  | Ensembl 101 EnsDb for Accipiter nisus
  AH89094  | Ensembl 102 EnsDb for Accipiter nisus
  AH89340  | Ensembl 103 EnsDb for Accipiter nisus
  AH95658  | Ensembl 104 EnsDb for Accipiter nisus
  AH97961  | Ensembl 105 EnsDb for Accipiter nisus
  AH100558 | Ensembl 106 EnsDb for Accipiter nisus
  AH104778 | Ensembl 107 EnsDb for Accipiter nisus

> Anisus.ensdb <- hub[["AH104778"]]
downloading 1 resources
retrieving 1 resource
  |======================================================================| 100%

loading from cache
Warning message:
package 'GenomeInfoDb' was built under R version 4.2.1 
> Anisus.ensdb
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.7
|Creation time: Wed Jul 20 12:06:03 2022
|ensembl_version: 107
|ensembl_host: localhost
|Organism: Accipiter nisus
|taxonomy_id: 211598
|genome_build: Accipiter_nisus_ver1.0
| No. of genes: 18087.
| No. of transcripts: 27904.
|Protein data available.

Mostly what is available for these species is positional data, as in this EnsDb, rather than functional annotation such as mappings between a gene ID and a GO term. If you want that sort of data, you might look at makeOrgPackageFromNCBI from the AnnotationForge package.
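
A call to makeOrgPackageFromNCBI might look something like the sketch below. It assumes NCBI actually has gene records for this taxonomy ID (211598 is the taxonomy_id reported by the EnsDb above), and the version, maintainer and author values are placeholders. Be aware that the function downloads large files from NCBI and can take a long time on the first run.

```r
library(AnnotationForge)

# Assumption: NCBI holds gene-level data for Accipiter nisus (tax ID 211598).
# version, maintainer and author are placeholder values for illustration.
makeOrgPackageFromNCBI(
    version    = "0.1",
    maintainer = "Your Name <you@example.org>",
    author     = "Your Name",
    outputDir  = ".",
    tax_id     = "211598",
    genus      = "Accipiter",
    species    = "nisus"
)
```

If it succeeds, the package source directory it writes to outputDir can be installed with install.packages(<that directory>, repos = NULL, type = "source").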

