Annotation for Macaca fascicularis
Eva • 0
@eva-24832

Hello

I'm looking for annotation for Macaca fascicularis, preferably in TxDb form. Does this exist? If not, can it be added? Thanks.


shepherl 3.8k
@lshep
United States

There are some in AnnotationHub:

> library(AnnotationHub)
> ah = AnnotationHub()
snapshotDate(): 2021-01-14
> query(ah, c("Macaca", "fascicularis"))
AnnotationHub with 86 records
# snapshotDate(): 2021-01-14
# $dataprovider: Ensembl, ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Macaca fascicularis, macaca fascicularis
# $rdataclass: TwoBitFile, GRanges, EnsDb, OrgDb
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["AH60097"]]' 

            title                                                           
  AH60097 | Macaca_fascicularis.Macaca_fascicularis_5.0.91.abinitio.gtf     
  AH60098 | Macaca_fascicularis.Macaca_fascicularis_5.0.91.chr.gtf          
  AH60099 | Macaca_fascicularis.Macaca_fascicularis_5.0.91.gtf              
  AH60451 | Macaca_fascicularis.Macaca_fascicularis_5.0.cdna.all.2bit       
  AH60452 | Macaca_fascicularis.Macaca_fascicularis_5.0.dna_rm.toplevel.2bit
  ...       ...                                                             
  AH88390 | Macaca_fascicularis.Macaca_fascicularis_5.0.cdna.all.2bit       
  AH88391 | Macaca_fascicularis.Macaca_fascicularis_5.0.dna_rm.toplevel.2bit
  AH88392 | Macaca_fascicularis.Macaca_fascicularis_5.0.dna_sm.toplevel.2bit
  AH88393 | Macaca_fascicularis.Macaca_fascicularis_5.0.ncrna.2bit          
  AH89201 | Ensembl 102 EnsDb for Macaca fascicularis
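
If you only care about the EnsDb records, the query can be narrowed by adding the data class as a search term. This is a minimal sketch, assuming a working AnnotationHub installation and network access; the variable name `ens` is just for illustration, and the hits will depend on your snapshot date.

```r
library(AnnotationHub)

ah <- AnnotationHub()

# All search terms must match somewhere in a record's metadata,
# so this keeps the species hits that also mention "EnsDb"
ens <- query(ah, c("Macaca fascicularis", "EnsDb"))

# Show the title and data class for each hit (row names are the AH IDs)
mcols(ens)[, c("title", "rdataclass")]
```

Any ID listed there can then be retrieved with `ens[["<ID>"]]`.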
@james-w-macdonald-5106
United States

Two choices.

> library(AnnotationHub)
> hub <- AnnotationHub()
  |======================================================================| 100%

snapshotDate(): 2020-10-27

> query(hub, c("fascicularis"))
AnnotationHub with 86 records
# snapshotDate(): 2020-10-27
# $dataprovider: Ensembl, ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Macaca fascicularis, macaca fascicularis
# $rdataclass: TwoBitFile, GRanges, EnsDb, OrgDb
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["AH60097"]]' 

            title                                                           
  AH60097 | Macaca_fascicularis.Macaca_fascicularis_5.0.91.abinitio.gtf     
  AH60098 | Macaca_fascicularis.Macaca_fascicularis_5.0.91.chr.gtf          
  AH60099 | Macaca_fascicularis.Macaca_fascicularis_5.0.91.gtf              
  AH60451 | Macaca_fascicularis.Macaca_fascicularis_5.0.cdna.all.2bit       
  AH60452 | Macaca_fascicularis.Macaca_fascicularis_5.0.dna_rm.toplevel.2bit
  ...       ...                                                             
  AH88390 | Macaca_fascicularis.Macaca_fascicularis_5.0.cdna.all.2bit       
  AH88391 | Macaca_fascicularis.Macaca_fascicularis_5.0.dna_rm.toplevel.2bit
  AH88392 | Macaca_fascicularis.Macaca_fascicularis_5.0.dna_sm.toplevel.2bit
  AH88393 | Macaca_fascicularis.Macaca_fascicularis_5.0.ncrna.2bit          
  AH89201 | Ensembl 102 EnsDb for Macaca fascicularis   

## The last one listed is the latest version from Ensembl

> ensdb <- hub[["AH89201"]]
downloading 1 resources
retrieving 1 resource
  |======================================================================| 100%

loading from cache
require("ensembldb")

> ensdb
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.6
|Creation time: Sat Dec 19 15:27:39 2020
|ensembl_version: 102
|ensembl_host: localhost
|Organism: Macaca fascicularis
|taxonomy_id: 9541
|genome_build: Macaca_fascicularis_5.0
|DBSCHEMAVERSION: 2.1
| No. of genes: 29324.
| No. of transcripts: 54368.
|Protein data available.
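
An EnsDb answers the same kind of range queries you would run on a TxDb. The following is a minimal sketch, assuming the AH89201 record from the snapshot shown above is still available in your hub (the ID may differ in later snapshots); the `protein_coding` filter is only an illustration.

```r
library(AnnotationHub)
library(ensembldb)

hub <- AnnotationHub()
ensdb <- hub[["AH89201"]]   # Ensembl 102 EnsDb for Macaca fascicularis

# The familiar accessors; each returns genomic ranges
gns <- genes(ensdb)
txs <- transcripts(ensdb)
exs <- exonsBy(ensdb, by = "tx")   # exons grouped by transcript

# EnsDb extra: restrict a query with an AnnotationFilter
pc <- genes(ensdb, filter = GeneBiotypeFilter("protein_coding"))
```

The filter framework is the main practical difference from a TxDb, so it is worth a look if you only need a subset of the annotation.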

See the ensembldb vignette; an EnsDb works more or less the same way as a TxDb. Or, if you prefer NCBI gene mappings, you can build a TxDb package from the UCSC ncbiRefSeq table:

> library(GenomicFeatures)
> makeTxDbPackageFromUCSC(version = "0.0.1", maintainer = "me <me@mine.org>", author = "me", genome = "macFas5", tablename = "ncbiRefSeq", circ_seqs = "chrM")
Download the ncbiRefSeq table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Creating package in ./TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq 
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: macFas5
# Organism: Macaca fascicularis
# Taxonomy ID: 9541
# UCSC Table: ncbiRefSeq
# UCSC Track: NCBI RefSeq
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: no gene ids
# Full dataset: yes
# miRBase build ID: NA
# Nb of transcripts: 76196
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2021-02-19 12:35:20 -0500 (Fri, 19 Feb 2021)
# GenomicFeatures version at creation time: 1.42.1
# RSQLite version at creation time: 2.2.1
# DBSCHEMAVERSION: 1.2
Warning message:
In .extract_cds_locs_from_UCSC_txtable(ucsc_txtable) :
  UCSC data anomaly in 143 transcript(s): the cds cumulative length is
  not a multiple of 3 for transcripts 'NM_001283689.1' 'NM_001285211.1'
  'XM_015444012.1' 'XM_015444059.1' 'XM_015457323.1' 'NM_001283504.1'
  'XM_005570961.2' 'XM_005571037.2' 'XM_015431068.1' 'XM_005572258.2'
  'XM_015431697.1' 'XM_005595219.2' 'XM_005595222.2' 'NM_001283842.1'
  'XM_015432569.1' 'NM_001284577.1' 'NM_001283244.1' 'NM_001284707.1'
  'XM_015433684.1' 'XM_005575944.2' 'XM_015433907.1' 'NM_001284894.1'
  'NM_001284919.1' 'NM_001289964.1' 'NM_001283655.1' 'XM_015435450.1'
  'NM_001283412.1' 'XM_015435829.1' 'XM_015444433.1' 'XM_015436239.1'
  'NM_001283810.1' 'NM_001284668.1' 'NM_001283404.1' 'NM_001283177.1'
  'NM_001284890.1' 'XM_015438219.1' 'XM_015438390.1' 'NM_001284986.1'
  'XM_005585962.2' 'NM_001284083.1' 'NM_001283746.1' 'XM_005587749.2'
  'NM_001283504.1' 'XM_015440813.1' 'XM_015440814.1' 'XM_015441185.1'
  'XM_005595313.2' 'XM_015444470.1' 'XM_015444479.1' 'XM_015444482.1'
  'XM_005595318.2'  [... truncated]

## Install it. I am on Windows, so I have to specify the type.

> install.packages("TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq", repos = NULL, type = "source")
Installing package into 'C:/Users/jmacdon/AppData/Roaming/R/win-library/4.0'
(as 'lib' is unspecified)
* installing *source* package 'TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
  converting help for package 'TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq'
    finding HTML links ... done
    package                                 html  
** building package indices
** testing if installed package can be loaded from temporary location
*** arch - i386
*** arch - x64
** testing if installed package can be loaded from final location
*** arch - i386
*** arch - x64
** testing if installed package keeps a record of temporary installation path
* DONE (TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq)

> library(TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq)

> TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: macFas5
# Organism: Macaca fascicularis
# Taxonomy ID: 9541
# UCSC Table: ncbiRefSeq
# UCSC Track: NCBI RefSeq
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: no gene ids
# Full dataset: yes
# miRBase build ID: NA
# Nb of transcripts: 76196
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2021-02-19 12:35:20 -0500 (Fri, 19 Feb 2021)
# GenomicFeatures version at creation time: 1.42.1
# RSQLite version at creation time: 2.2.1
# DBSCHEMAVERSION: 1.2
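
Once the package is installed, it behaves like any other TxDb. This is a minimal sketch, assuming the locally built `TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq` package from the steps above (it is not on Bioconductor, so it only exists if you built and installed it yourself); the variable names are illustrative.

```r
library(GenomicFeatures)
library(TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq)

txdb <- TxDb.Mfascicularis.UCSC.macFas5.ncbiRefSeq

# Transcript ranges, keeping the RefSeq accession as a metadata column
tx <- transcripts(txdb, columns = c("tx_id", "tx_name"))

# Exons and coding regions grouped by transcript (list names are internal tx_ids)
ex_by_tx  <- exonsBy(txdb, by = "tx")
cds_by_tx <- cdsBy(txdb, by = "tx")
```

Note that the printout reports "no gene ids", so I would expect gene-level accessors such as `genes()` to have nothing to group by; stick to transcript-level queries, or use the EnsDb if you need genes.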