makeTxDbPackage() after makeTxDbFromGFF() --> Error in spc[[2]] : subscript out of bounds
@yacinebadis-8240
United Kingdom

Dear all

I have been struggling with this for several hours, and after trying to follow the GenomicFeatures package documentation and the related help pages and forums, I am still stuck:

A gff3 file (https://bioinformatics.psb.ugent.be/gdb/ectocarpus/EctsiV2_gff3_LATEST.tar.gz) and a gtf file (https://bioinformatics.psb.ugent.be/gdb/ectocarpus/EctsiV2_gtf_LATEST.tar.gz) are my only annotation resources for Ectocarpus siliculosus, and I want to forge a TxDb package for use in several Bioconductor applications.

 

I should maybe add that the annotations are organised in supercontigs (sctgs) and not chromosomes (could this be a source of the problem?).

The untarred gff3 archive is in fact split into one gff3 file per supercontig, and is not accepted by makeTxDb(). I was able to obtain a TxDb object with either the .gtf file or a gff3 file obtained with Cufflinks' gffread tool. Those two TxDb objects differ in their numbers of transcripts, exons and CDSs. In any case, makeTxDbPackage() always gives me the same error, "Error in spc[[2]] : subscript out of bounds", with either of those TxDb objects; please see below.

Could somebody have a look at this gtf file and tell me whether the error comes from the gtf file itself, or whether I am missing something obvious here?

Seriously lost here, any help would be greatly appreciated!

Many Thanks,

 

Yacine

 

In Linux

gffread -E EctsiV2_all.gtf -o- > EctsiV2_all.gtf.gff3   


In R

> txdb <- makeTxDbFromGFF("EctsiV2_all.gtf", format="gtf", circ_seqs=character())
Prepare the 'metadata' data frame ... metadata: OK
Warning message:
In .reject_transcripts(bad_tx, because) :
  The following transcripts were rejected because they have CDSs that
  cannot be mapped to an exon: Esi0003_0153.1, Esi0015_0061.1,
  Esi0044_0024.1, Esi0074_0010.1, Esi0093_0006.1, Esi0098_0023.1,
  Esi0117_0096.1, Esi0123_0031.1, Esi0165_0042.1, Esi0168_0075.1,
  Esi0197_0020.1, Esi0205_0069.1, Esi0221_0029.1, Esi0264_0026.1,
  Esi0279_0055.1, Esi0304_0028.1, Esi0364_0035.1, Esi0369_0025.1,
  Esi0370_0030.1, Esi0376_0031.1, Esi0392_0017.1, Esi0423_0003.1,
  Esi0445_0015.1, Esi0651_0011.1, Esi0772_0002.1, Esi0798_0001.1,
  Esi1446_0002.1, Esi1480_0001.1, Esi1751_0001.1
> txdb2 <- makeTxDbFromGFF("EctsiV2_all.gtf.gff3", format="gff3", circ_seqs=character())
Prepare the 'metadata' data frame ... metadata: OK
> txdb
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: EctsiV2_all.gtf
# Organism: NA
# miRBase build ID: NA
# Genome: NA
# transcript_nrow: 18406
# exon_nrow: 162850
# cds_nrow: 136291
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2015-06-22 09:19:39 +0100 (Mon, 22 Jun 2015)
# GenomicFeatures version at creation time: 1.20.1
# RSQLite version at creation time: 1.0.0
# DBSCHEMAVERSION: 1.1
> txdb2
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: EctsiV2_all.gtf.gff3
# Organism: NA
# miRBase build ID: NA
# Genome: NA
# transcript_nrow: 18435
# exon_nrow: 139594
# cds_nrow: 136421
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2015-06-22 09:20:27 +0100 (Mon, 22 Jun 2015)
# GenomicFeatures version at creation time: 1.20.1
# RSQLite version at creation time: 1.0.0
# DBSCHEMAVERSION: 1.1
> makeTxDbPackage()
Error in AnnotationDbi:::dbconn(txdb) :
  error in evaluating the argument 'x' in selecting a method for function 'dbconn': Error: argument "txdb" is missing, with no default
> makeTxDbPackage(txdb=txdb)
> ls()
[1] "txdb"  "txdb2"
> makeTxDbPackage(txdb=txdb, version="0.1", maintainer="<Yacine.Badis@sams.ac.uk>", author= "Yacine Badis", destDir=".", license= "Artistic-2.0")
Error in spc[[2]] : subscript out of bounds
> makeTxDbPackage(txdb=txdb2, version="0.1", maintainer="<Yacine.Badis@sams.ac.uk>", author= "Yacine Badis", destDir=".", license= "Artistic-2.0")
Error in spc[[2]] : subscript out of bounds
> makeTxDbPackage(txdb=txdb2)
Error in spc[[2]] : subscript out of bounds

> traceback()
4: paste0(substr(spc[[1]], 1, 1), spc[[2]])
3: .abbrevOrganismName(.getMetaDataValue(txdb, "Organism"))
2: .makePackageName(txdb)
1: makeTxDbPackage(txdb = txdb2)
>

 

> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] GenomicFeatures_1.20.1 AnnotationDbi_1.30.1   Biobase_2.28.0        
[4] GenomicRanges_1.20.5   GenomeInfoDb_1.4.1     IRanges_2.2.4         
[7] S4Vectors_0.6.0        BiocGenerics_0.14.0   

loaded via a namespace (and not attached):
 [1] XML_3.98-1.1            Rsamtools_1.20.4        Biostrings_2.36.1      
 [4] GenomicAlignments_1.4.1 bitops_1.0-6            futile.options_1.0.0   
 [7] DBI_0.3.1               RSQLite_1.0.0           zlibbioc_1.14.0        
[10] XVector_0.8.0           futile.logger_1.4.1     lambda.r_1.1.7         
[13] BiocParallel_1.2.3      tools_3.2.0             biomaRt_2.24.0         
[16] RCurl_1.95-4.6          rtracklayer_1.28.5     
>

 

@herve-pages-1542
Seattle, WA, United States

Hi Yacine,

The error message doesn't help but it seems that makeTxDbPackage() was not able to automatically give a name to the package to be created. You can either:

  1. Use the devel version of BioC (BioC 3.2) where I think an extra argument was added to makeTxDbPackage() to let the user specify the name of the package to be created.
  2. Or, with BioC 3.1 (i.e. the current release and the version you are using), specify the organism argument when you call makeTxDbFromGFF(). makeTxDbPackage() will then use that information to infer the name to give to the package to be created. Make sure the string you pass to organism is of the form "<Genus> <species>", or makeTxDbPackage() will fail in the same way as you experienced.
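
To illustrate why the "<Genus> <species>" form matters, here is a minimal sketch in base R. The function abbrev_organism() is a hypothetical stand-in, reconstructed only from frame 4 of the traceback in the question; it is not the package's actual source. The guarded block at the end shows what option 2 might look like with the file name and arguments from the question, assuming the GenomicFeatures version used there (1.20.1) and that the GTF file sits in the working directory.

```r
# Hypothetical stand-in for the internal helper seen in the traceback
# (.abbrevOrganismName); reconstructed from frame 4, not copied from the package.
abbrev_organism <- function(organism) {
  spc <- unlist(strsplit(as.character(organism), " "))
  # Assumes two words: first letter of the genus followed by the species name.
  paste0(substr(spc[[1]], 1, 1), spc[[2]])
}

# With no organism recorded ("Organism: NA"), there is no second word to index,
# so the call fails; the error message is captured here instead of stopping.
res_na <- tryCatch(abbrev_organism(NA), error = function(e) conditionMessage(e))
print(res_na)

# A "<Genus> <species>" string splits into two words, so the abbreviation works.
print(abbrev_organism("Ectocarpus siliculosus"))

# Option 2 applied to the file from the question. This part runs only if
# GenomicFeatures is installed and the GTF file is in the working directory.
if (requireNamespace("GenomicFeatures", quietly = TRUE) &&
    file.exists("EctsiV2_all.gtf")) {
  txdb <- GenomicFeatures::makeTxDbFromGFF(
    "EctsiV2_all.gtf",
    format = "gtf",
    organism = "Ectocarpus siliculosus"  # recorded in the TxDb metadata
  )
  GenomicFeatures::makeTxDbPackage(
    txdb,
    version = "0.1",
    maintainer = "<Yacine.Badis@sams.ac.uk>",
    author = "Yacine Badis",
    destDir = ".",
    license = "Artistic-2.0"
  )
}
```

Printing the TxDb object after the makeTxDbFromGFF() call should show the organism in place of "Organism: NA", which is a quick check to make before attempting to build the package.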

Hope this helps,

H.


I tried option 2, and the package was built successfully!

Many thanks for your help Herve!

Yacine
