Dear all,
I am seeing a very long elapsed time when importing a small toy GFF3 file
with `import.gff3()`. Here is what I did:
    > library(rtracklayer)
    > library(TxDb.Hsapiens.UCSC.hg19.knownGene)
    > txdb = TxDb.Hsapiens.UCSC.hg19.knownGene
    > grl.exons = exonsBy(txdb, by = 'tx')[1:42]
    > export.gff3(grl.exons, 'test.gff3')
    > system.time(import.gff3('test.gff3'))
       user  system elapsed
      0.892   0.092  60.428
It takes about one minute to import the file, and the time varies a lot between runs.
I don't understand what is going on here, and I don't know whether it happens
for you too or only on my system.
Any suggestions?
    > sessionInfo()
    R version 3.3.2 (2016-10-31)
    Platform: x86_64-suse-linux-gnu (64-bit)
    Running under: openSUSE Tumbleweed

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] parallel  stats4    stats     graphics  grDevices utils
    [7] datasets  methods   base

    other attached packages:
     [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
     [2] GenomicFeatures_1.26.0
     [3] AnnotationDbi_1.36.0
     [4] Biobase_2.34.0
     [5] rtracklayer_1.34.1
     [6] GenomicRanges_1.26.1
     [7] GenomeInfoDb_1.10.1
     [8] IRanges_2.8.1
     [9] S4Vectors_0.12.0
    [10] BiocGenerics_0.20.0

    loaded via a namespace (and not attached):
     [1] XVector_0.14.0             zlibbioc_1.20.0
     [3] GenomicAlignments_1.10.0   BiocParallel_1.8.1
     [5] BSgenome_1.42.0            lattice_0.20-34
     [7] tools_3.3.2                SummarizedExperiment_1.4.0
     [9] grid_3.3.2                 DBI_0.5-1
    [11] Matrix_1.2-7.1             bitops_1.0-6
    [13] RCurl_1.95-4.8             biomaRt_2.30.0
    [15] RSQLite_1.0.0              BiocInstaller_1.24.0
    [17] Biostrings_2.42.0          Rsamtools_1.26.1
    [19] XML_3.98-1.4
Martin, thanks as always for your incisive help. Installing the BSgenome and BSgenome.Hsapiens.UCSC.hg19 packages solved the problem.