Hello!
I am trying to build the exon annotation files needed by the calculateRPKM function in customProDB. I followed the instructions to get the genomic and protein FASTA files to use with PrepareAnnotationRefseq. However, I need the annotation for hg38, and the id-mapping step fails with "Unknown track: refGene", even though the refGene table itself downloads without a problem. The call and its output are below:
PrepareAnnotationRefseq(genome = "hg38", CDSfasta = "hg38_customprodb", pepfasta = "hg38_customprodb_prot", annotation_path = "annotations/")
Build TranscriptDB object (txdb.sqlite) ...
Download the refGene table ... OK
Download the hgFixed.refLink table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
done
Prepare gene/transcript/protein id mapping information (ids.RData) ... Error in normArgTrack(track, trackids) : Unknown track: refGene
In addition: Warning message:
In .extractCdsLocsFromUCSCTxTable(ucsc_txtable, exon_locs) :
UCSC data anomaly in 581 transcript(s): the cds cumulative length is not a multiple
of 3 for transcripts ‘NM_020469’ ‘NM_032470’ ‘NM_032470’ ‘NM_032470’ ‘NM_032470’
‘NM_006806’ ‘NM_004522’ ‘NM_005155’ ‘NM_001281971’ ‘NM_014512’ ‘NM_015868’
‘NM_015868’ ‘NM_015868’ ‘NM_015868’ ‘NM_015868’ ‘NM_015868’ ‘NM_004715’
‘NM_001282170’ ‘NM_001321102’ ‘NM_080598’ ‘NM_000363’ ‘NM_015068’ ‘NM_001348266’
‘NM_001291485’ ‘NM_153443’ ‘NM_153443’ ‘NM_001349989’ ‘NM_001114397’ ‘NM_001954’
‘NM_001318855’ ‘NM_001318854’ ‘NM_002457’ ‘NM_001304561’ ‘NM_023035’
‘NM_001349991’ ‘NM_025259’ ‘NM_001256510’ ‘NM_014608’ ‘NM_006781’ ‘NM_012314’
‘NM_001322168’ ‘NM_012313’ ‘NM_001242357’ ‘NM_006295’ ‘NM_001306077’
‘NM_001145457’ ‘NM_001282170’ ‘NM_001282171’ ‘NM_001282171’ ‘NM_001282171’
‘NM_001282171 [... truncated]
Session info:
sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicFeatures_1.30.3 GenomicRanges_1.30.1 GenomeInfoDb_1.14.0
[4] customProDB_1.18.0 biomaRt_2.34.2 AnnotationDbi_1.40.0
[7] Biobase_2.38.0 IRanges_2.12.0 S4Vectors_0.16.0
[10] BiocGenerics_0.24.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.15 plyr_1.8.4 pillar_1.1.0
[4] compiler_3.4.1 XVector_0.18.0 prettyunits_1.0.2
[7] bitops_1.0-6 tools_3.4.1 progress_1.1.2
[10] zlibbioc_1.24.0 digest_0.6.15 bit_1.1-12
[13] AhoCorasickTrie_0.1.0 BSgenome_1.46.0 lattice_0.20-35
[16] RSQLite_2.0 memoise_1.1.0 tibble_1.4.2
[19] pkgconfig_2.0.1 rlang_0.1.6 Matrix_1.2-12
[22] DelayedArray_0.4.1 DBI_0.7 GenomeInfoDbData_1.0.0
[25] rtracklayer_1.38.3 stringr_1.2.0 httr_1.3.1
[28] Biostrings_2.46.0 grid_3.4.1 bit64_0.9-7
[31] R6_2.2.2 RMySQL_0.10.13 XML_3.98-1.9
[34] BiocParallel_1.12.0 blob_1.1.0 magrittr_1.5
[37] matrixStats_0.53.0 Rsamtools_1.30.0 GenomicAlignments_1.14.1
[40] assertthat_0.2.0 SummarizedExperiment_1.8.1 stringi_1.1.6
[43] RCurl_1.95-4.10 VariantAnnotation_1.24.5