Off topic: DEXSeq with FlyBase GFF file
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Last seen 10 weeks ago
Germany

Hi,

I am trying to prepare my data for DEXSeq with the Python script provided by the package.

I have downloaded the annotation file from FlyBase. It is a GFF file of considerable size (over 2 GB when unzipped).

wget ftp://flybase.net/genomes/Drosophila_melanogaster/current/gff/dmel-all-r6.03.gff.gz
gunzip dmel-all-r6.03.gff.gz
mv dmel-all-r6.03.gff Dmel_r6.03.gff
python ~/R/x86_64-pc-linux-gnu-library/3.1/DEXSeq/python_scripts/dexseq_prepare_annotation.py Dmel_r6.03.gff gffFiles/Dmel.DEXSeq.r6.03.gtf

When I run the Python script to convert it to a DEXSeq-friendly format, I get this error message:

Traceback (most recent call last):
  File "/home/yeroslaviz/R/x86_64-pc-linux-gnu-library/3.1/DEXSeq/python_scripts/dexseq_prepare_annotation.py", line 54, in <module>
    f.attr['gene_id'] = f.attr['gene_id'].replace( ":", "_" )
KeyError: 'gene_id'
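
My guess at the cause, as a minimal sketch rather than a confirmed diagnosis: the script looks up a gene_id attribute, which is how GTF files label features, whereas GFF3 files such as the FlyBase one use key=value attributes like ID and Parent. The two attribute strings below are hypothetical examples and not lines copied from the FlyBase file, and parse_attributes is an illustrative helper, not part of DEXSeq or HTSeq.

```python
def parse_attributes(column9):
    """Parse the ninth column of a GFF3 or GTF line into a dict.

    GFF3 style:  ID=FBgn0000001;Name=example
    GTF style:   gene_id "FBgn0000001"; transcript_id "FBtr0000001";
    """
    attrs = {}
    for field in column9.strip().strip(";").split(";"):
        field = field.strip()
        if not field:
            continue
        if '"' in field:
            # GTF style: key, a space, then a quoted value
            key, _, value = field.partition(" ")
            value = value.strip().strip('"')
        elif "=" in field:
            # GFF3 style: key=value
            key, value = field.split("=", 1)
        else:
            # Bare flag without a value
            key, value = field, ""
        attrs[key.strip()] = value
    return attrs


# Hypothetical attribute strings, for illustration only
gff3_attr = "ID=FBgn0000001;Name=example-gene"
gtf_attr = 'gene_id "FBgn0000001"; transcript_id "FBtr0000001";'

for label, text in (("GFF3", gff3_attr), ("GTF", gtf_attr)):
    attrs = parse_attributes(text)
    print(label, sorted(attrs), "has gene_id:", "gene_id" in attrs)
```

If this is right, a lookup such as attrs['gene_id'] would fail on the GFF3-style line in the same way as line 54 of the script does.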

Is there a way to make DEXSeq work with this file?

Assa


sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] DEXSeq_1.12.1           BiocParallel_1.0.0      DESeq2_1.6.1           
 [4] RcppArmadillo_0.4.500.0 Rcpp_0.11.3             GenomicRanges_1.18.1   
 [7] GenomeInfoDb_1.2.3      IRanges_2.0.0           S4Vectors_0.4.0        
[10] Biobase_2.26.0          BiocGenerics_0.12.1    

loaded via a namespace (and not attached):
 [1] acepack_1.3-3.3      annotate_1.44.0      AnnotationDbi_1.28.1
 [4] base64enc_0.1-2      BatchJobs_1.5        BBmisc_1.8          
 [7] biomaRt_2.22.0       Biostrings_2.34.0    bitops_1.0-6        
[10] brew_1.0-6           checkmate_1.5.0      cluster_1.15.3      
[13] codetools_0.2-9      colorspace_1.2-4     DBI_0.3.1           
[16] digest_0.6.4         fail_1.2             foreach_1.4.2       
[19] foreign_0.8-61       Formula_1.1-2        genefilter_1.48.1   
[22] geneplotter_1.44.0   ggplot2_1.0.0        grid_3.1.0          
[25] gtable_0.1.2         Hmisc_3.14-5         hwriter_1.3.2       
[28] iterators_1.0.7      lattice_0.20-29      latticeExtra_0.6-26
[31] locfit_1.5-9.1       MASS_7.3-35          munsell_0.4.2       
[34] nnet_7.3-8           plyr_1.8.1           proto_0.3-10        
[37] RColorBrewer_1.0-5   RCurl_1.95-4.3       reshape2_1.4        
[40] rpart_4.1-8          Rsamtools_1.18.2     RSQLite_1.0.0       
[43] scales_0.2.4         sendmailR_1.2-1      splines_3.1.0       
[46] statmod_1.4.20       stringr_0.6.2        survival_2.37-7     
[49] tools_3.1.0          XML_3.98-1.1         xtable_1.7-4        
[52] XVector_0.6.0        zlibbioc_1.12.0    
dexseq gff gtf drosophila melanogaster • 1.2k views