Question: I can't import my GFF file with `importTranscripts()` when working with SGSeq
2315440517 wrote, 4 months ago:

I ran into trouble while working with SGSeq. The species we study has no transcript annotation available as a TxDb object, so I need to import my own GFF file with the function importTranscripts(). However, the call fails with the error "Error: subscript contains invalid names". I wonder whether the fault lies in my options or in my GFF file.

This is my code:

fname<-file.choose("Triplophysarosa.evm.gff")
fname
## "D:\\bioinformation\\Triplophysarosa\\05.Annotation\\02.Gene_Prediction\\Triplophysarosa.evm.gff"
file.exists(fname)
##[1] TRUE
importTranscripts("D:\\bioinformation\\Triplophysarosa\\05.Annotation\\02.Gene_Prediction\\Triplophysarosa.evm.gff",tag_tx = "contig1",tag_gene = "evm.TU.contig365.6")
Error: subscript contains invalid names
10. stop(wmsg(...), call. = FALSE)
9. .subscript_error("subscript contains invalid ", what)
8. NSBS(i, x, exact = exact, strict.upper.bound = !allow.append, allow.NAs = allow.NAs)
7. NSBS(i, x, exact = exact, strict.upper.bound = !allow.append, allow.NAs = allow.NAs)
6. normalizeSingleBracketSubscript(j, xstub)
5. mcols(exons)[c(tag_tx, tag_gene)]
4. mcols(exons)[c(tag_tx, tag_gene)]
3. data.frame(mcols(exons)[c(tag_tx, tag_gene)])
2. unique(data.frame(mcols(exons)[c(tag_tx, tag_gene)]))
1. importTranscripts(fname, tag_tx = "contig1", tag_gene = "evm.TU.contig365.6")
sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7600)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936 
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936   
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C                                                   
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices
[6] utils     datasets  methods   base     

other attached packages:
 [1] SGSeq_1.16.2               
 [2] SummarizedExperiment_1.12.0
 [3] DelayedArray_0.8.0         
 [4] BiocParallel_1.16.6        
 [5] matrixStats_0.54.0         
 [6] Biobase_2.42.0             
 [7] Rsamtools_1.34.1           
 [8] Biostrings_2.50.2          
 [9] XVector_0.22.0             
[10] GenomicRanges_1.34.0       
[11] GenomeInfoDb_1.18.2        
[12] IRanges_2.16.0             
[13] S4Vectors_0.20.1           
[14] BiocGenerics_0.28.0        

loaded via a namespace (and not attached):
 [1] magrittr_1.5             GenomicFeatures_1.34.4  
 [3] gtable_0.2.0             zlibbioc_1.28.0         
 [5] memoise_1.1.0            hms_0.4.2               
 [7] RCurl_1.95-4.12          pillar_1.3.1            
 [9] progress_1.2.0           stringr_1.4.0           
[11] lattice_0.20-38          rtracklayer_1.42.2      
[13] bit_1.1-14               plyr_1.8.4              
[15] knitr_1.22               GenomicAlignments_1.18.1
[17] igraph_1.2.4             pkgconfig_2.0.2         
[19] Matrix_1.2-15            R6_2.4.0                
[21] GenomeInfoDbData_1.2.0   digest_0.6.18           
[23] xfun_0.5                 colorspace_1.4-0        
[25] AnnotationDbi_1.44.0     stringi_1.3.1           
[27] lazyeval_0.2.1           yaml_2.2.0              
[29] RSQLite_2.1.1            tibble_2.0.1            
[31] httr_1.4.0               compiler_3.5.2          
[33] bit64_0.9-7              munsell_0.5.0           
[35] DBI_1.0.0                Rcpp_1.0.0              
[37] biomaRt_2.38.0           XML_3.98-1.19           
[39] RUnit_0.4.32             assertthat_0.2.0        
[41] blob_1.1.1               ggplot2_3.1.0           
[43] prettyunits_1.0.2        tools_3.5.2             
[45] bitops_1.0-6             scales_1.0.0            
[47] crayon_1.3.4             rlang_0.3.1             
[49] grid_3.5.2  
Tags: annotation software error
Answer: I can't import my GFF file with `importTranscripts()` when working with SGSeq
Leonard Goldstein (United States) wrote, 4 months ago:

When posting a question about a software package, please always tag your post with the package name. This triggers an automatic email from the system to the package maintainer.

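Regarding your question -- the tag_tx and tag_gene arguments specify the names of the relevant GFF tags, that is, the attribute names that hold the transcript and gene identifiers, not individual identifier values. Please try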

importTranscripts("D:\\bioinformation\\Triplophysarosa\\05.Annotation\\02.Gene_Prediction\\Triplophysarosa.evm.gff",tag_tx = "ID",tag_gene = "Name")
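
To see which tag names a given file offers, one option is to list the attribute keys found on its exon rows before calling importTranscripts(). The sketch below does this in base R only. It is not part of SGSeq: `list_gff_exon_tags()` is a hypothetical helper, the small GFF3 file is made up for illustration, and the code assumes a GFF3-style ninth column of `key=value` pairs separated by semicolons. The traceback in the question shows the lookup failing at `mcols(exons)[c(tag_tx, tag_gene)]`, which suggests that both names need to be present among the attributes of the exon features.

```r
# Hypothetical helper (not part of SGSeq): list the attribute tag names
# that appear on exon rows of a GFF3 file, using base R only.
list_gff_exon_tags <- function(path) {
  lines <- readLines(path)
  # Drop comment/directive lines and empty lines
  lines <- lines[!grepl("^#", lines) & nzchar(lines)]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  # Keep only well-formed rows with the nine GFF columns
  fields <- fields[lengths(fields) == 9]
  # Column 3 is the feature type, column 9 holds the attributes
  is_exon <- vapply(fields, function(f) f[3] == "exon", logical(1))
  attrs <- vapply(fields[is_exon], function(f) f[9], character(1))
  # Split "key=value;key=value" into pairs and keep the keys
  pairs <- trimws(unlist(strsplit(attrs, ";", fixed = TRUE)))
  pairs <- pairs[nzchar(pairs)]
  unique(sub("=.*$", "", pairs))
}

# Made-up example file (assumption: not taken from the poster's data)
demo <- tempfile(fileext = ".gff")
writeLines(c(
  "##gff-version 3",
  "contig1\tEVM\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=gene1",
  "contig1\tEVM\texon\t100\t300\t.\t+\t.\tID=tx1.exon1;Parent=tx1;Name=gene1"
), demo)

tags <- list_gff_exon_tags(demo)
print(tags)
```

For this made-up file the helper returns ID, Parent and Name. Running it on your own file shows the candidate values for tag_tx and tag_gene.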

Thank you very much, my problem has been solved.

Reply written 4 months ago by 2315440517

Dear professor, I have another question: can the SGSeq package identify all splice events across the whole genome in a single run, or can I only get the splice events for one particular gene per run?

Reply written 3 months ago by 2315440517

Both are possible. Please see vignette section 6 for an example on how to analyze a particular region of the genome using the which argument. If no region is specified the analysis is genome-wide. Depending on your data set, genome-wide predictions can be computationally intensive. In case you run into problems, please see vignette section 13 for considerations about parallelization and memory requirements.

Reply written 3 months ago by Leonard Goldstein