Hello,
as I can open an xz-compressed GFF file with read.delim but not with rtracklayer::import, I suppose that there is a bug in rtracklayer…
The error message is:
Error in .normarg_input_filepath(filepath) :
file "test.gff3.xz" has unsupported type: xzfile
The session below looks longish, but I just:
- import a GFF file
- import a gz-compressed GFF file
- fail to import an xz-compressed GFF file
- read the xz-compressed GFF file as a data.frame with read.delim
R version 4.1.1 (2021-08-10) -- "Kick Things"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library('rtracklayer')
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames,
dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Loading required package: GenomeInfoDb
> import('test.gff3')
GRanges object with 1 range and 6 metadata columns:
seqnames ranges strand | source type score phase
<Rle> <IRanges> <Rle> | <factor> <factor> <numeric> <integer>
[1] ctg123 1000-9000 + | NA gene NA <NA>
ID Name
<character> <character>
[1] gene00001 EDEN
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> import('test.gff3.gz')
GRanges object with 1 range and 6 metadata columns:
seqnames ranges strand | source type score phase
<Rle> <IRanges> <Rle> | <factor> <factor> <numeric> <integer>
[1] ctg123 1000-9000 + | NA gene NA <NA>
ID Name
<character> <character>
[1] gene00001 EDEN
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> import('test.gff3.xz')
Error in .normarg_input_filepath(filepath) :
file "test.gff3.xz" has unsupported type: xzfile
> read.delim('test.gff3.xz', comment.char='#')
[1] ctg123 . gene
[4] X1000 X9000 ..1
[7] X. ..2 ID.gene00001.Name.EDEN
<0 rows> (or 0-length row.names)
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 11 (bullseye)
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] rtracklayer_1.53.1 GenomicRanges_1.45.0 GenomeInfoDb_1.29.5
[4] IRanges_2.27.2 S4Vectors_0.31.3 BiocGenerics_0.39.2
loaded via a namespace (and not attached):
[1] rstudioapi_0.13 XVector_0.33.0
[3] zlibbioc_1.39.0 GenomicAlignments_1.29.0
[5] BiocParallel_1.27.4 lattice_0.20-44
[7] rjson_0.2.20 tools_4.1.1
[9] grid_4.1.1 SummarizedExperiment_1.23.4
[11] parallel_4.1.1 Biobase_2.53.0
[13] matrixStats_0.60.1 yaml_2.2.1
[15] crayon_1.4.1 BiocIO_1.3.0
[17] Matrix_1.3-3 GenomeInfoDbData_1.2.6
[19] restfulr_0.0.13 bitops_1.0-7
[21] RCurl_1.98-1.4 DelayedArray_0.19.2
[23] compiler_4.1.1 MatrixGenerics_1.5.4
[25] Biostrings_2.61.2 Rsamtools_2.9.1
[27] XML_3.99-0.7
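In the meantime, a possible workaround is to decompress the file with base R before handing it to import. The sketch below is only that: decompress_xz is a hypothetical helper name, not part of rtracklayer, and it assumes the file is small enough to read into memory and that R was built with xz support. It relies on the uncompressed import working, as it does in the session above.

```r
# Hypothetical helper (not part of rtracklayer): decompress an
# xz-compressed text file into a temporary file and return its path.
decompress_xz <- function(path) {
  # Keep the inner extension (e.g. ".gff3") at the end of the temporary
  # file name, so that format detection by file name can still work.
  inner_name <- sub("\\.xz$", "", basename(path))
  out <- tempfile(fileext = paste0("_", inner_name))

  # xzfile() is the base R connection for xz-compressed files.
  con <- xzfile(path, open = "rt")
  on.exit(close(con))

  # Assumption: the file fits in memory; all lines are read at once.
  writeLines(readLines(con), out)
  out
}

# Usage, assuming rtracklayer is installed and test.gff3.xz exists:
# gr <- rtracklayer::import(decompress_xz("test.gff3.xz"))
```

This only sidesteps the problem; it does not explain why xz connections are rejected while gz ones are accepted.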