The UCSC Genome Browser has updated its GENCODE track to v24, and the track "GENCODE v22" is no longer supported. This affects makeTxDbFromUCSC() and makeTxDbPackageFromUCSC(), which depend on supportedUCSCtables("hg38"). With the current GenomicFeatures v1.24.5 on R 3.3 or R 3.4, we get:
> supportedUCSCtables(genome="hg38")
                        track           subtrack
knownGene         GENCODE v22               <NA>
knownGeneOld3  Old UCSC Genes               <NA>
ccdsGene                 CCDS               <NA>
refGene          RefSeq Genes               <NA>
xenoRefGene      Other RefSeq               <NA>
vegaGene           Vega Genes Vega Protein Genes
vegaPseudoGene     Vega Genes   Vega Pseudogenes
> TxDb <- makeTxDbFromUCSC(genome="hg38", tablename="knownGene")
Error in normArgTrack(track, trackids) : Unknown track: GENCODE v22
> sessionInfo()
R Under development (unstable) (2016-09-18 r71304)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] GenomicFeatures_1.24.5 AnnotationDbi_1.34.4   Biobase_2.32.0
[4] GenomicRanges_1.24.2   GenomeInfoDb_1.8.7     IRanges_2.6.1
[7] S4Vectors_0.10.3       BiocGenerics_0.18.0    BiocInstaller_1.22.3
I am able to build a TxDb from the UCSC GENCODE v24 track with the following code, but it would be handy to have the GenomicFeatures functions do this tedious work.
library(GenomicFeatures)
library(rtracklayer)

# Open a UCSC browser session and point it at hg38
session <- browserSession("UCSC")
genome(session) <- "hg38"

tablename <- "knownGene"
track <- "GENCODE v24"

# Fetch the transcript table directly from the renamed track
ucsc_txtable <- getTable(ucscTableQuery(session, track=track, table=tablename))

# The ':::' calls below use unexported GenomicFeatures internals,
# so they may change between package versions
mapdef <- GenomicFeatures:::.howToGetTxName2GeneIdMapping(tablename)
txname2geneid <- GenomicFeatures:::.fetchTxName2GeneIdMappingFromUCSC(
    session, track, tablename, mapdef)

transcript_ids <- NULL
txdb <- GenomicFeatures:::.makeTxDbFromUCSCTxTable(
    ucsc_txtable, txname2geneid$genes,
    genome="hg38", tablename, track,
    txname2geneid$gene_id_type,
    full_dataset = is.null(transcript_ids),
    circ_seqs = DEFAULT_CIRC_SEQS,
    taxonomyId = NA,
    miRBaseBuild = NA)
NOTE: I tried the devel version of GenomicFeatures, v1.25.16, downloaded from https://bioconductor.org/packages/devel/bioc/html/GenomicFeatures.html, but the knownGene tablename is not available there. Therefore, I cannot use makeTxDbFromUCSC() to build the TxDb from the GENCODE track. I thought Herve had fixed this in v1.25.17, but I cannot see that version on the devel page, and I am not sure whether the track name has also been updated to GENCODE v24 in v1.25.17.
Any update will be appreciated.
Thanks,
Chao-Jen
Thank you, Valerie and Herve.