Import gff from character?
Marlin ▴ 20
@marlin-11371

The rtracklayer::import method for signature "GFFFile,ANY,ANY" has a text argument. The help page says:

## S4 method for signature 'GFFFile,ANY,ANY'
import(con, format, text,
           version = c("", "1", "2", "3"),
           genome = NA, colnames = NULL, which = NULL,
           feature.type = NULL, sequenceRegionsAsSeqinfo = FALSE)

text:  If con is missing, a character vector to use as the input.

But the argument does not seem to work. I looked at the source ( https://github.com/Bioconductor-mirror/rtracklayer/blob/1d5b9b3979d277b1dfd5642fd714d7c7bc75ca84/R/gff.R#L233-L296 ), and it does not appear to support that argument. In my case, I have already read the file with readLines to check its hash, and I do not want to read the file again or write it to a temporary file and then import that. Is there any way to import GFF from a character vector?

 

rtracklayer gff gtf import import.gff3 • 1.1k views
@martin-morgan-1513

For a reproducible example I went to ?import and found a gff3 file. I read it in using readLines, and then imported the data using the text argument:

> library(rtracklayer)
> test_path <- system.file("tests", package = "rtracklayer")
> test_gff3 <- file.path(test_path, "genes.gff3")
> gff <- readLines(test_gff3)
> gr <- import(format = "gff3", text = gff)
Warning in readGFF(filepath, version = version, filter = filter) :
  connection is not positioned at the start of the file, rewinding it

The warning seems harmless. Here's my sessionInfo()

> sessionInfo()
R version 3.4.1 Patched (2017-07-04 r72894)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS

Matrix products: default
BLAS: /home/mtmorgan/bin/R-3-4-branch/lib/libRblas.so
LAPACK: /home/mtmorgan/bin/R-3-4-branch/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] rtracklayer_1.36.3   GenomicRanges_1.28.3 GenomeInfoDb_1.12.2 
[4] IRanges_2.10.2       S4Vectors_0.14.3     BiocGenerics_0.22.0 

loaded via a namespace (and not attached):
 [1] lattice_0.20-35            matrixStats_0.52.2        
 [3] XML_3.98-1.9               Rsamtools_1.28.0          
 [5] Biostrings_2.44.1          GenomicAlignments_1.12.1  
 [7] bitops_1.0-6               grid_3.4.1                
 [9] zlibbioc_1.22.0            XVector_0.16.0            
[11] Matrix_1.2-10              BiocParallel_1.11.3       
[13] tools_3.4.1                Biobase_2.36.2            
[15] RCurl_1.95-4.8             DelayedArray_0.2.7        
[17] compiler_3.4.1             SummarizedExperiment_1.6.3
[19] GenomeInfoDbData_0.99.0   
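
For the use case in the question (the lines are already in memory because they were read for hashing), the whole flow might look like the sketch below. It is only an illustration: the use of the digest package for the hash is an assumption (the question does not say how the hash was computed), and suppressWarnings() silences every warning from the call, not just the rewinding one, so drop it if you want to see other problems.

```r
library(rtracklayer)

# Example GFF3 file shipped with rtracklayer (the same file used above)
test_gff3 <- file.path(system.file("tests", package = "rtracklayer"),
                       "genes.gff3")

# Read the file once; everything below reuses this character vector
gff <- readLines(test_gff3)

# Assumption: hash the in-memory lines with the digest package, if installed.
# serialize = FALSE hashes the string itself rather than its R serialization.
if (requireNamespace("digest", quietly = TRUE)) {
  md5 <- digest::digest(paste(gff, collapse = "\n"),
                        algo = "md5", serialize = FALSE)
}

# Import directly from the character vector; con is left missing and the
# format is given explicitly because there is no file name to guess it from
gr <- suppressWarnings(import(format = "gff3", text = gff))
```

Note that a hash of the joined lines need not equal the md5sum of the file on disk, since readLines drops the line terminators.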

What did you try, and what is your sessionInfo()?

 


Sorry, Martin, it was my fault. I was using import.gff and assumed its arguments were passed on to import. Your example works for me. Thanks!
