I have an issue using the R package customProDB to download annotation data from Ensembl as described in its documentation (section 2.2, "ENSEMBL annotation from BIOMART"). To check for yourself, run browseVignettes("customProDB").
Several other users have reported the same issue in the past, for example in the thread "GenomicFeatures makeTranscriptDbFromBiomart 'chrominfo' data frame ... FAILED!". However, those threads did not help me resolve it.
The problem is that PrepareAnnotationEnsembl fails to download the 'chrominfo' data frame (see the console output below), but a subsequent customProDB function needs that data.
How do I get the chrominfo?
# install necessary packages
# if (!requireNamespace("BiocManager", quietly = TRUE))
#     install.packages("BiocManager")
#
# BiocManager::install("customProDB")
library(customProDB)
library(seqinr)
library(glue)
library(rtracklayer)
# open documentation in browser
# browseVignettes("customProDB")
# specify annotation data from ENSEMBL
ensembl <- useMart(
  "ENSEMBL_MART_ENSEMBL",
  dataset = "hsapiens_gene_ensembl",
  host = "dec2021.archive.ensembl.org",
  path = "/biomart/martservice",
  archive = FALSE
)
# where to save the downloaded data
annotation_path <- "~/Documents/ncHLAII_immunopeptidomics/ensembl"
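# create the directory (and any missing parents); stay quiet if it already exists
dir.create(annotation_path, recursive = TRUE, showWarnings = FALSE)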
# download and save the annotation data
PrepareAnnotationEnsembl(
  mart = ensembl,
  annotation_path = annotation_path,
  splice_matrix = FALSE,
  dbsnp = NULL,
  COSMIC = FALSE
)
Output from Console:
Prepare gene/transcript/protein id mapping information (ids.RData) ... done
Build TranscriptDB object (txdb.sqlite) ...
Download and preprocess the 'transcripts' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... FAILED! (=> skipped)
Download and preprocess the 'splicings' data frame ... OK
Download and preprocess the 'genes' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ...
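To see what the skipped 'chrominfo' step actually leaves out, one option is to load the saved TxDb and look at its sequence lengths. This is only a minimal diagnostic sketch, not a fix. It assumes that txdb.sqlite was written into annotation_path, and that the AnnotationDbi and GenomeInfoDb packages are installed (both come with a typical customProDB installation). The helper missing_seqlengths is a hypothetical name introduced here for illustration.

```r
# Hypothetical helper: names of sequences whose length is unknown (NA).
missing_seqlengths <- function(seqlens) {
  names(seqlens)[is.na(seqlens)]
}

# Same directory as in the script above.
annotation_path <- "~/Documents/ncHLAII_immunopeptidomics/ensembl"

# Assumption: the TxDb was saved as txdb.sqlite inside annotation_path.
txdb_file <- file.path(annotation_path, "txdb.sqlite")

if (file.exists(txdb_file)) {
  # Load the TxDb object from the SQLite file.
  txdb <- AnnotationDbi::loadDb(txdb_file)

  # Named vector of chromosome lengths; entries are NA when unknown.
  seqlens <- GenomeInfoDb::seqlengths(txdb)
  print(head(seqlens))

  # Chromosomes for which no length was recorded.
  print(missing_seqlengths(seqlens))
} else {
  message("No txdb.sqlite found in ", annotation_path)
}
```

If every length comes back as NA, that would be consistent with the 'chrominfo' download having been skipped.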