rtracklayer v1.30.0 can no longer import gff files
Jenny Drnevich ★ 2.0k
@jenny-drnevich-2812
United States

Hello,

I just upgraded to R 3.2.2 / BioC 3.2 / rtracklayer 1.30.0, and some of my code that imports NCBI's gff3 files now throws an error, although it worked fine with R 3.2.1 / BioC 3.1 / rtracklayer 1.28.6. Below is example code for both versions, trying to import NCBI's mouse ref_GRCm38.p3_top_level.gff3.gz downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/M_musculus/GFF/. The new rtracklayer throws the error "cannnot determine seqnames column unambiguously". I looked through the help page ?import.gff3 in both versions but don't see any changes. Is this a new bug? My workaround is to save the GRanges object from the old version as a .RData file and then load it into the new R/BioC, which seems to work fine. Any help in getting the new rtracklayer to read in the gff file would be appreciated!

Thanks,

Jenny

R 3.2.1 / BioC 3.1 / rtracklayer 1.28.6:

R version 3.2.1 (2015-06-18) -- "World-Famous Astronaut"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

#lines removed ...

> .libPaths()
[1] "C:/Users/drnevich/Documents/R/win-library/3.2"
[2] "C:/Program Files/R/R-3.2.1/library"           
> 
> #Change to point to old BioC3.1 packages I saved...
> 
> .libPaths(new = "C:/Users/drnevich/Documents/R/win-library/3.2_BioC3.1")
> 
> library(rtracklayer)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel
#lines removed...
> 
> setwd("D:/Statistics/Freund/Fire_sept2015/ReSeq/")
> 
> #mouse GFF downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/M_musculus/GFF/
> 
> gff0 <- import("ref_GRCm38.p3_top_level.gff3.gz")
> #no errors!
> 
> save(gff0, file = "ref_GRCm38.p3_top_level.gff3.RData")
> 
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
[1] rtracklayer_1.28.6   GenomicRanges_1.20.5 GenomeInfoDb_1.4.1   IRanges_2.2.5       
[5] S4Vectors_0.6.2      BiocGenerics_0.14.0 

loaded via a namespace (and not attached):
 [1] XML_3.98-1.3            Rsamtools_1.20.4        Biostrings_2.36.1      
 [4] bitops_1.0-6            GenomicAlignments_1.4.1 futile.options_1.0.0   
 [7] zlibbioc_1.14.0         XVector_0.8.0           futile.logger_1.4.1    
[10] lambda.r_1.1.7          BiocParallel_1.2.11     tools_3.2.1            
[13] RCurl_1.95-4.7 

 

R 3.2.2 / BioC 3.2 / rtracklayer 1.30.0:

R version 3.2.2 (2015-08-14) -- "Fire Safety"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
#lines removed...

> .libPaths()
[1] "C:/Users/drnevich/Documents/R/win-library/3.2"
[2] "C:/Program Files/R/R-3.2.2/library"           
> #keep to use the new BioC 3.2 packages
> 
> library(rtracklayer)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel
#lines removed...
> 
> setwd("D:/Statistics/Freund/Fire_sept2015/ReSeq/")
> 
> #mouse GFF downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/M_musculus/GFF/
> 
> gff0 <- import("ref_GRCm38.p3_top_level.gff3.gz")
Error in .find_seqnames_col(df_colnames0, seqnames.field0, prefix) : 
  cannnot determine seqnames column unambiguously
> 
> #Load in the RData file output from R 3.2.1:
> load("ref_GRCm38.p3_top_level.gff3.RData")
>
> #Check to see if I can use it: 
> table(gff0$type)

    C_gene_segment         cDNA_match                CDS     D_gene_segment 
                32               8710             937664                 24 
            D_loop               exon               gene     J_gene_segment 
                 1            1170729              48835                156 
             match               mRNA              ncRNA primary_transcript 
              7271              78013              24746               1283 
            region               rRNA   sequence_variant         transcript 
               195                 35                  6               7067 
              tRNA     V_gene_segment 
               437                613 

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
[1] rtracklayer_1.30.0   GenomicRanges_1.22.0
[3] GenomeInfoDb_1.6.0   IRanges_2.4.0       
[5] S4Vectors_0.8.0      BiocGenerics_0.16.0 

loaded via a namespace (and not attached):
 [1] XML_3.98-1.3               Rsamtools_1.22.0          
 [3] Biostrings_2.38.0          GenomicAlignments_1.6.0   
 [5] bitops_1.0-6               futile.options_1.0.0      
 [7] zlibbioc_1.16.0            XVector_0.10.0            
 [9] futile.logger_1.4.1        lambda.r_1.1.7            
[11] BiocParallel_1.4.0         tools_3.2.2               
[13] Biobase_2.30.0             RCurl_1.95-4.7            
[15] SummarizedExperiment_1.0.0
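
The save-and-reload workaround described in the question could be wrapped in a small fallback helper. This is only a minimal sketch: `import_with_fallback`, `good_import` and `bad_import` are hypothetical names that do not appear in the thread, the importer is passed in as an argument so that the example runs without rtracklayer installed, and `saveRDS()`/`readRDS()` are swapped in for the `save()`/`load()` calls used above.

```r
# Hypothetical helper: try a fresh import, fall back to a cached copy on error.
# `import_fun` stands in for rtracklayer::import (an assumption of this sketch).
import_with_fallback <- function(path, cache, import_fun) {
  tryCatch({
    obj <- import_fun(path)
    saveRDS(obj, cache)  # refresh the cache after a successful import
    obj
  }, error = function(e) {
    # No cached copy available: re-signal the original error.
    if (!file.exists(cache)) stop(e)
    warning("import failed (", conditionMessage(e), "); using cached copy")
    readRDS(cache)
  })
}

# Demonstration with stand-in importers instead of a real GFF3 file.
cache <- tempfile(fileext = ".rds")
good_import <- function(path) list(type = c("gene", "exon"))
bad_import <- function(path) stop("cannnot determine seqnames column unambiguously")

# The first call succeeds and writes the cache.
first <- import_with_fallback("dummy.gff3", cache, good_import)
# The second call fails to import and returns the cached object instead.
second <- suppressWarnings(import_with_fallback("dummy.gff3", cache, bad_import))
```

With the real package, one would presumably pass `rtracklayer::import` as `import_fun` and the path to the gff3.gz file as `path`. The cache only helps if it was written by a version of the package that could still read the file.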

 

Post tags: rtracklayer, import.gff3, bug
@herve-pages-1542
Seattle, WA, United States

Hi Jenny,

This is a regression I introduced in import.gff() when I re-implemented it in BioC 3.2. I just fixed it in rtracklayer 1.30.1, which should become available tomorrow (Oct 22nd) via biocLite(). Thanks for the catch and sorry for the inconvenience.

Cheers,

H.
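
For readers unsure whether their installation is affected, a version check along these lines may help. It is a sketch under stated assumptions: `has_gff_regression` is a hypothetical helper, and it treats only versions from 1.30.0 up to (but not including) 1.30.1 as affected, since that is all this thread reports.

```r
# Version in which the fix landed, according to the answer above.
fixed_version <- package_version("1.30.1")

# Hypothetical helper: TRUE if the given version string falls in the range
# reported as affected in this thread (1.30.0 <= version < 1.30.1).
has_gff_regression <- function(installed) {
  installed <- package_version(installed)
  installed >= package_version("1.30.0") && installed < fixed_version
}

has_gff_regression("1.28.6")  # FALSE
has_gff_regression("1.30.0")  # TRUE
has_gff_regression("1.30.1")  # FALSE

# Check the locally installed version, if the package is present at all.
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  v <- as.character(utils::packageVersion("rtracklayer"))
  if (has_gff_regression(v)) {
    message("rtracklayer ", v, " is affected; update to 1.30.1 or later")
  }
}
```

At the time of this thread the update itself was done through biocLite(), as mentioned in the answer.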

OK, it finally came through late Thursday night, in time for my workshop last Friday. Thank you!!

Jenny

Glad you got this in time for your workshop!  H.
