Question: Error when trying to load a gff3 with GenomicFeatures
Asked 2.7 years ago by gil.hornung


I downloaded the GFF file for S. cerevisiae from NCBI:

I'm trying to open the file with the following command:

txdb <- makeTranscriptDbFromGFF("genomes/Saccharomyces_cerevisiae/GCF_000146045.2_R64_genomic.gff",

And I get the following error:

extracting transcript information
Extracting gene IDs
extracting transcript information
Processing splicing information for gff3 file.
Deducing exon rank from relative coordinates provided
Prepare the 'metadata' data frame ... metadata: OK
Now generating chrominfo from available sequence names. No chromosome length information is available.
Error in sqliteSendQuery(con, statement, :
  rsqlite_query_send: could not execute: UNIQUE constraint failed: splicing._tx_id, splicing.exon_rank
In addition: Warning messages:
1: In .deduceExonRankings(exs, format = "gff") :
  Infering Exon Rankings.  If this is not what you expected, then please be sure that you have provided a valid attribute for exonRankAttributeName
2: In matchCircularity(chroms, circ_seqs) :
  None of the strings in your circ_seqs argument match your seqnames.
3: 'dbBeginTransaction' is deprecated.
Use 'dbBegin' instead.
See help("Deprecated")
4: 'dbBeginTransaction' is deprecated.
Use 'dbBegin' instead.
See help("Deprecated")
5: 'dbBeginTransaction' is deprecated.
Use 'dbBegin' instead.
See help("Deprecated")
6: 'dbBeginTransaction' is deprecated.
Use 'dbBegin' instead.
See help("Deprecated")
7: 'dbBeginTransaction' is deprecated.
Use 'dbBegin' instead.
See help("Deprecated") 

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8    
[8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GenomicFeatures_1.18.7   AnnotationDbi_1.28.2     Biobase_2.26.0           DESeq2_1.6.3             RcppArmadillo_0.5.000.0  Rcpp_0.12.3              VariantAnnotation_1.12.9
[8] Rsamtools_1.18.3         Biostrings_2.34.1        XVector_0.6.0            GenomicRanges_1.18.4     GenomeInfoDb_1.2.5       IRanges_2.0.1            S4Vectors_0.4.0       
[15] BiocGenerics_0.12.1      amap_0.8-14              matrixStats_0.14.2       ggplot2_1.0.1            gplots_2.17.0            RColorBrewer_1.1-2       BiocParallel_1.0.3      

loaded via a namespace (and not attached):
[1] acepack_1.3-3.3         annotate_1.44.0         base64enc_0.1-3         BatchJobs_1.6           BBmisc_1.9              biomaRt_2.22.0          bitops_1.0-6            brew_1.0-6           
[9] BSgenome_1.34.1         caTools_1.17.1          checkmate_1.5.2         cluster_2.0.3           codetools_0.2-14        colorspace_1.2-6        DBI_0.3.1               digest_0.6.9         
[17] fail_1.3                foreach_1.4.3           foreign_0.8-66          Formula_1.2-1           gdata_2.17.0            genefilter_1.48.1       geneplotter_1.44.0      GenomicAlignments_1.2.2
[25] grid_3.1.1              gridExtra_2.0.0         gtable_0.1.2            gtools_3.5.0            Hmisc_3.17-0            iterators_1.0.8         KernSmooth_2.23-15      lattice_0.20-31      
[33] latticeExtra_0.6-28     locfit_1.5-9.1          magrittr_1.5            MASS_7.3-40             munsell_0.4.3           nnet_7.3-12             plyr_1.8.2              proto_0.3-10         
[41] RCurl_1.95-4.7          reshape2_1.4.1          rpart_4.1-10            RSQLite_1.0.0           rtracklayer_1.26.3      scales_0.3.0            sendmailR_1.2-1         splines_3.1.1        
[49] stringi_1.0-1           stringr_1.0.0           survival_2.38-3         tools_3.1.1             XML_3.98-1.3            xtable_1.8-2            zlibbioc_1.12.0        


Answer written 2.7 years ago by Mike Smith (EMBL Heidelberg / de.NBI):

You're using old versions of both R and the GenomicFeatures package. Try upgrading to R 3.2.4 and installing the most recent version of GenomicFeatures. The makeTranscriptDbFromGFF() function was deprecated and then removed; its replacement is makeTxDbFromGFF().

This combination works for me:

txdb <- makeTxDbFromGFF("GCF_000146045.2_R64_genomic.gff", format = "gff3")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .find_exon_cds(exons, cds) :
  The following transcripts have exons that contain more than one CDS
  (only the first CDS was kept for each exon): rna1045, rna114, rna1154,
  rna1156, rna1208, rna1210, rna1266, rna1318, rna1738, rna1765, rna1867,
  rna210, rna2230, rna2249, rna228, rna2320, rna2377, rna2379, rna2559,
  rna2805, rna2911, rna2983, rna3144, rna3289, rna3291, rna4010, rna4084,
  rna4269, rna4420, rna4426, rna4522, rna4529, rna4873, rna5098, rna5303,
  rna5557, rna5610, rna5655, rna5755, rna576, rna5834, rna6032, rna6040,
  rna6223, rna6247, rna6249, rna973
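
For anyone unsure which of the two functions their installation provides, here is a minimal sketch of one way to check before importing. The helper name `choose_importer` and the variable `gff_path` are hypothetical and not part of GenomicFeatures; the file name is the one used above and is assumed to sit in the working directory. The import step is skipped if the package or the file is missing.

```r
# Hypothetical helper (not part of GenomicFeatures): given the names a package
# exports, return the preferred GFF importer, newest name first.
choose_importer <- function(available) {
  preferred <- c("makeTxDbFromGFF", "makeTranscriptDbFromGFF")
  hit <- preferred[preferred %in% available]
  if (length(hit) == 0L) {
    stop("No GFF importer found among the supplied function names")
  }
  hit[[1L]]
}

# Only attempt the import when GenomicFeatures is actually installed.
if (requireNamespace("GenomicFeatures", quietly = TRUE)) {
  message("GenomicFeatures version: ",
          as.character(utils::packageVersion("GenomicFeatures")))

  importer_name <- choose_importer(getNamespaceExports("GenomicFeatures"))
  importer <- getExportedValue("GenomicFeatures", importer_name)

  # Assumed location of the downloaded annotation file.
  gff_path <- "GCF_000146045.2_R64_genomic.gff"
  if (file.exists(gff_path)) {
    txdb <- importer(gff_path, format = "gff3")
  }
}
```

If `choose_importer()` returns `makeTranscriptDbFromGFF`, the installation predates the rename and upgrading is probably the simpler fix. Releases newer than the ones discussed in this thread may organise these functions differently, so check the package documentation for your version.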



Thank you, Mike!

Reply written 2.7 years ago by gil.hornung