rtracklayer import.gff3 function no longer working in R 3.4
Keith Hughitt ▴ 170
@keith-hughitt-6740
Last seen 10 months ago
United States

After I upgraded to R 3.4/Bioconductor 3.4 today, the `import.gff3` function (and presumably other related functions) stopped working for me.

For example, using this sample GFF3 file from ENSEMBL, with spaces replaced with tabs:

##gff-version    3
ctg123    .    mRNA    1300    9000    .    +    .    ID=mrna0001;Name=sonichedgehog
ctg123    .    exon    1300    1500    .    +    .    ID=exon00001;Parent=mrna0001
ctg123    .    exon    1050    1500    .    +    .    ID=exon00002;Parent=mrna0001
ctg123    .    exon    3000    3902    .    +    .    ID=exon00003;Parent=mrna0001
ctg123    .    exon    5000    5500    .    +    .    ID=exon00004;Parent=mrna0001
ctg123    .    exon    7000    9000    .    +    .    ID=exon00005;Parent=mrna0001
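
Since the listing above may render with spaces rather than literal tabs, here is a minimal sketch that writes the same records to a tab-delimited file from R. The helper name `write_sample_gff3` and the temporary file are assumptions for illustration only. The version directive is written with a single space, as the GFF3 specification shows it. The import is attempted only if rtracklayer is installed, and any error is reported rather than raised.

```r
# Hypothetical helper: write the sample records with real tab separators.
write_sample_gff3 <- function(path) {
  records <- list(
    c("ctg123", ".", "mRNA", "1300", "9000", ".", "+", ".", "ID=mrna0001;Name=sonichedgehog"),
    c("ctg123", ".", "exon", "1300", "1500", ".", "+", ".", "ID=exon00001;Parent=mrna0001"),
    c("ctg123", ".", "exon", "1050", "1500", ".", "+", ".", "ID=exon00002;Parent=mrna0001"),
    c("ctg123", ".", "exon", "3000", "3902", ".", "+", ".", "ID=exon00003;Parent=mrna0001"),
    c("ctg123", ".", "exon", "5000", "5500", ".", "+", ".", "ID=exon00004;Parent=mrna0001"),
    c("ctg123", ".", "exon", "7000", "9000", ".", "+", ".", "ID=exon00005;Parent=mrna0001")
  )
  # Join the nine columns of each record with a tab character.
  lines <- vapply(records, paste, character(1), collapse = "\t")
  writeLines(c("##gff-version 3", lines), path)
  invisible(path)
}

gff_path <- write_sample_gff3(tempfile(fileext = ".gff"))

# Try the import only when rtracklayer is available; report errors as messages.
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  tryCatch(
    print(rtracklayer::import.gff3(gff_path)),
    error = function(e) message("import failed: ", conditionMessage(e))
  )
}
```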

Attempting to load the file results in the following error:

> library('rtracklayer')
> import.gff3('test.gff')
Error in match.arg(pruning.mode) : 
  'arg' should be one of “error”, “coarse”, “fine”, “tidy”
Calls: import.gff3 ... genome<- -> seqinfo<- -> seqinfo<- -> <Anonymous> -> match.arg

I checked if the issue persists in R 3.5/Bioconductor 3.5, and it seems to work fine there.

Since R 3.4 was just released and it will be a while before 3.5 hits the shelves, it may be helpful to backport the fix to Bioconductor 3.4.

Regards,

Keith

System info:

R 3.4

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS: /usr/lib/libblas.so.3.7.0
LAPACK: /usr/lib/liblapack.so.3.7.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] rtracklayer_1.34.2   GenomicRanges_1.26.4 GenomeInfoDb_1.11.11
[4] IRanges_2.8.2        S4Vectors_0.12.2     BiocGenerics_0.20.0 
[7] colorout_1.1-2      

loaded via a namespace (and not attached):
 [1] lattice_0.20-35            XML_3.98-1.6              
 [3] Rsamtools_1.26.2           Biostrings_2.42.1         
 [5] GenomicAlignments_1.10.1   bitops_1.0-6              
 [7] grid_3.4.0                 zlibbioc_1.20.0           
 [9] XVector_0.14.1             Matrix_1.2-9              
[11] BiocParallel_1.8.2         tools_3.4.0               
[13] Biobase_2.34.0             RCurl_1.95-4.8            
[15] compiler_3.4.0             SummarizedExperiment_1.4.0
[17] GenomeInfoDbData_0.99.0 

R SVN (3.5)

> sessionInfo()
R Under development (unstable) (2017-04-24 r72617)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS: /usr/lib/libblas.so.3.7.0
LAPACK: /usr/lib/liblapack.so.3.7.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] rtracklayer_1.35.12   GenomicRanges_1.27.23 GenomeInfoDb_1.11.11 
[4] IRanges_2.9.19        S4Vectors_0.13.17     BiocGenerics_0.21.3  
[7] colorout_1.1-2       

loaded via a namespace (and not attached):
 [1] XVector_0.15.2              zlibbioc_1.21.0            
 [3] GenomicAlignments_1.11.12   BiocParallel_1.9.6         
 [5] lattice_0.20-35             tools_3.5.0                
 [7] SummarizedExperiment_1.5.10 grid_3.5.0                 
 [9] Biobase_2.35.1              matrixStats_0.52.2         
[11] Matrix_1.2-9                GenomeInfoDbData_0.99.0    
[13] bitops_1.0-6                RCurl_1.95-4.8             
[15] DelayedArray_0.1.11         compiler_3.5.0             
[17] Biostrings_2.43.8           Rsamtools_1.27.16          
[19] XML_3.98-1.6 


It looks like GenomeInfoDb is from Bioconductor 3.5, not 3.4. And just so it's clear, Bioc 3.5 is targeting R 3.4, not R 3.5. We will not target R 3.5 until Bioc 3.7. It's only coincidental that the versions are so similar now.
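
One rough way to spot this kind of mix-up from `sessionInfo()` output alone: under Bioconductor's versioning scheme, release packages carry an even minor version (the `y` in `x.y.z`) and devel packages an odd one. The sketch below applies that heuristic to the versions listed in the R 3.4 session above. The helper `is_devel_version` is a hypothetical name, and the check is only meaningful for Bioconductor packages, so `colorout` is left out.

```r
# Versions copied from "other attached packages" in the R 3.4 session above.
pkgs <- c(
  rtracklayer   = "1.34.2",
  GenomicRanges = "1.26.4",
  GenomeInfoDb  = "1.11.11",
  IRanges       = "2.8.2",
  S4Vectors     = "0.12.2",
  BiocGenerics  = "0.20.0"
)

# Hypothetical helper: TRUE when the minor version component is odd,
# which under the Bioconductor convention indicates a devel build.
is_devel_version <- function(versions) {
  minor <- vapply(
    versions,
    function(v) unclass(package_version(v))[[1]][2],
    integer(1)
  )
  minor %% 2L == 1L
}

flagged <- names(pkgs)[is_devel_version(pkgs)]
print(flagged)
# [1] "GenomeInfoDb"
```

This is only a quick screen; `biocValid()` compares against the actual repositories and is the more reliable check.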


Good catch! It looks like the "3.4" library directory was previously used by the SVN version of R, so some newer packages got mixed in. Martin's suggestion below further supports this. Thanks for the insight!

@martin-morgan-1513
Last seen 18 days ago
United States

Does BiocInstaller::biocValid() suggest any packages that are out-of-sync (too old or too new) for your installation?


Indeed it does!

downgrade with biocLite(c("ExpressionAtlas", "GenomeInfoDb", "OrganismDbi", "knitcitations"))
Error: 288 package(s) out of date; 4 package(s) too new   

In the past, I would just run `biocLite()` to see which packages were out of date, but it reported nothing in this case, so everything appeared to be in order.

Reinstalling the packages per the suggestion in the biocValid() output fixed the issue.
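
For anyone hitting the same thing, the check can be wrapped so that the validity report is captured as text instead of stopping the session. This is a sketch under assumptions: `check_bioc_install` is a hypothetical wrapper name, and it relies on `biocValid()` signalling an error when packages are out of sync, as in the output above. In later Bioconductor releases, where BiocInstaller was replaced by BiocManager, the corresponding call is `BiocManager::valid()`.

```r
# Hypothetical wrapper: return biocValid()'s verdict as a character string.
check_bioc_install <- function() {
  if (!requireNamespace("BiocInstaller", quietly = TRUE)) {
    return("BiocInstaller is not installed")
  }
  tryCatch(
    {
      BiocInstaller::biocValid()
      "installation is valid"
    },
    # biocValid() raises an error listing out-of-date and too-new packages.
    error = function(e) conditionMessage(e)
  )
}

message(check_bioc_install())
```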

Thanks for the help!
