Michael Dondrup
Hi,
I am trying to read a genome annotation from an NCBI GFF3 file [1].
The file is about 7.5 MB and has ~17000 non-comment lines. While I can
read the file with read.delim in less than a second, trying

bsub = import.gff("~/Downloads/bsubtilis.gff")

is very slow. I would prefer to use a standardized function from the
package that understands various formats, but currently I cannot use it
for whole-genome annotation. Could this be improved, or is the file
format incorrect?
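
For reference, here is a minimal sketch of the kind of read.delim call the comparison refers to, wrapped in system.time() so the timing can be reproduced. The helper name read_gff_plain and the two-line sample file are hypothetical, made up only so the snippet runs without the real download; the column names are the nine fields of the GFF3 specification. Note that comment.char = "#" would also truncate any attribute value that happens to contain a "#".

```r
# Column names follow the nine tab-separated fields of the GFF3 format.
gff_cols <- c("seqid", "source", "type", "start", "end",
              "score", "strand", "phase", "attributes")

# Hypothetical helper: read a GFF3 file as a plain data frame,
# skipping "#" comment lines and leaving quotes untouched.
read_gff_plain <- function(path) {
  read.delim(path, header = FALSE, comment.char = "#", quote = "",
             col.names = gff_cols, stringsAsFactors = FALSE)
}

# Made-up two-feature sample so the sketch runs without the real file.
sample_gff <- tempfile(fileext = ".gff")
writeLines(c(
  "##gff-version 3",
  "seq1\texample\tgene\t100\t900\t.\t+\t.\tID=gene0;Name=geneA",
  "seq1\texample\tCDS\t100\t900\t.\t+\t0\tID=cds0;Parent=gene0"
), sample_gff)

# Time the plain read; elapsed time is reported by system.time().
timing <- system.time(gff <- read_gff_plain(sample_gff))
print(timing)
str(gff)
```

Wrapping the import.gff call above in system.time() in the same way, with the real file path, should give a directly comparable number.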
Best
Michael
[1]: ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Bacillus_subtilis/AL009126.gff
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rtracklayer_1.8.1 RCurl_1.4-2 bitops_1.0-4.1
loaded via a namespace (and not attached):
[1] Biobase_2.8.0 Biostrings_2.16.0 BSgenome_1.16.1
[4] GenomicRanges_1.0.9 IRanges_1.6.6 XML_3.1-0
>