Question: goseq package: Is the package in sync with the most current Ensembl release?
2.4 years ago by
University of Illinois, Urbana-Champaign
mrodrigues.fernanda10 wrote:

Hi! Does the goseq package always use the most current release of an Ensembl genome, or could it be using older archived releases?

How often is it updated? Is there anywhere I can obtain this information (through R or Bioconductor webpage)?

Thank you!!!

modified 2.4 years ago by Johannes Rainer1.3k • written 2.4 years ago by mrodrigues.fernanda10
2.4 years ago by
Australia

Hi,

The answer to this is a bit complicated because goseq depends on a few different databases. For the gene length information, it fetches data from either the geneLenDataBase package or a TxDb annotation package, or downloads it from UCSC on the fly. geneLenDataBase is not updated, so the Ensembl release used there will be old. goseq also uses the org annotation packages (e.g. org.Hs.eg.db) to get the mapping between genes and GO terms; I'm not sure how frequently the Ensembl versions are updated for those. If you want to be sure you are using the latest version, you could use biomaRt or other sources (e.g. lengths from featureCounts or another summarisation tool) to get the lengths and GO mappings, and supply these manually to goseq.
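
As a rough illustration of supplying lengths and GO mappings manually, here is a minimal sketch. The gene IDs, lengths, DE calls and GO assignments are made-up toy data (assumptions for illustration only); in practice they would come from biomaRt, featureCounts or a similar source. It relies on the `bias.data` argument of `nullp()` and the `gene2cat` argument of `goseq()`, and the goseq calls are only run if the package is installed.

```r
## Toy inputs (hypothetical): 200 fake Ensembl-style gene IDs
set.seed(1)
gene_ids <- paste0("ENSG", sprintf("%011d", 1:200))

## Named 0/1 vector: 1 = differentially expressed, 0 = not
de_genes <- setNames(rbinom(length(gene_ids), 1, 0.2), gene_ids)

## Gene lengths in bp, in the same order as de_genes
gene_lengths <- setNames(
  sample(500:10000, length(gene_ids), replace = TRUE),
  gene_ids
)

## Gene-to-GO mapping as a two-column data frame (gene ID, category)
gene2go <- unique(data.frame(
  gene = sample(gene_ids, 300, replace = TRUE),
  category = sample(c("GO:0008150", "GO:0003674", "GO:0005575"),
                    300, replace = TRUE),
  stringsAsFactors = FALSE
))

## Only run the goseq steps if the package is available
if (requireNamespace("goseq", quietly = TRUE)) {
  library(goseq)
  ## Fit the length-bias weighting using our own lengths
  pwf <- nullp(de_genes, bias.data = gene_lengths, plot.fit = FALSE)
  ## Test enrichment using our own gene-to-category mapping
  res <- goseq(pwf, gene2cat = gene2go)
  print(head(res))
}
```

Because both the lengths and the mapping are passed in explicitly, goseq should not need to look anything up in geneLenDataBase or the org packages, so the annotation release is whatever you used to build these inputs.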

Cheers,

2.4 years ago by
Johannes Rainer1.3k
Italy
Johannes Rainer1.3k wrote:

In case you need the lengths for Ensembl genes: the EnsDb.Hsapiens.v75 package provides Ensembl gene annotations for release 75. It's not the newest, but it should be easy enough to generate EnsDb databases/packages for other, newer Ensembl releases from their GTF files (check the ensembldb package).

library(ensembldb)
library(EnsDb.Hsapiens.v75)

## Get the gene lengths
geneL <- lengthOf(EnsDb.Hsapiens.v75, of="gene")

## Or of transcripts
txL <- lengthOf(EnsDb.Hsapiens.v75, of="tx")
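
For a newer release, something along the following lines might work. This is only a sketch: the GTF file name is a hypothetical local path (you would download the file for the release you need from the Ensembl FTP site), and the code does nothing unless that file exists and ensembldb is installed. It uses `ensDbFromGtf()` to build the SQLite database and `EnsDb()` to open it.

```r
## Hypothetical local path to an Ensembl GTF; replace with your own file
gtf_file <- "Homo_sapiens.GRCh38.99.gtf.gz"

if (requireNamespace("ensembldb", quietly = TRUE) && file.exists(gtf_file)) {
  library(ensembldb)
  ## Build an SQLite annotation database from the GTF;
  ## the function returns the name of the database file
  db_file <- ensDbFromGtf(gtf = gtf_file)
  ## Open the database and extract gene lengths as before
  edb <- EnsDb(db_file)
  geneL_new <- lengthOf(edb, of = "gene")
  print(head(geneL_new))
}
```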

cheers, jo
