Splicing analysis with the ASpli package gives an error when I use the binGenome function
@salvocomplicazioni1-14125
Germany

I'd like to use the ASpli package, but when I run the following code:

library(GenomicFeatures)
library(ASpli)
genomeTxDb <- makeTxDbFromGFF(par.l$GTF)
features <- binGenome(genomeTxDb)

I get back this error message:

181 genes were dropped because they have exons located on both strands of the same reference sequence or on more
  than one reference sequence, so cannot be represented by a single genomic range.
  Use 'single.strand.genes.only=FALSE' to get all the genes in a GRangesList object, or use suppressMessages() to
  suppress this message.
Error in .Call2("Rle_constructor", values, lengths, PACKAGE = "S4Vectors") : 
  Rle of type 'list' is not supported

Here are the sessionInfo() and traceback() outputs:

> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10                 lattice_0.20-45             prettyunits_1.1.1           png_0.1-8                  
 [5] Rsamtools_2.14.0            Biostrings_2.66.0           assertthat_0.2.1            digest_0.6.31              
 [9] utf8_1.2.2                  BiocFileCache_2.6.0         R6_2.5.1                    GenomeInfoDb_1.34.7        
[13] stats4_4.2.0                RSQLite_2.2.20              httr_1.4.4                  pillar_1.8.1               
[17] zlibbioc_1.44.0             rlang_1.0.6                 GenomicFeatures_1.50.4      progress_1.2.2             
[21] curl_5.0.0                  rstudioapi_0.14             blob_1.2.3                  S4Vectors_0.36.1           
[25] Matrix_1.5-3                BiocParallel_1.32.5         stringr_1.5.0               RCurl_1.98-1.9             
[29] bit_4.0.5                   biomaRt_2.54.0              DelayedArray_0.24.0         compiler_4.2.0             
[33] rtracklayer_1.58.0          pkgconfig_2.0.3             BiocGenerics_0.44.0         tidyselect_1.2.0           
[37] KEGGREST_1.38.0             SummarizedExperiment_1.28.0 tibble_3.1.8                GenomeInfoDbData_1.2.9     
[41] matrixStats_0.63.0          IRanges_2.32.0              codetools_0.2-18            XML_3.99-0.13              
[45] fansi_1.0.4                 crayon_1.5.2                dplyr_1.0.10                dbplyr_2.3.0               
[49] GenomicAlignments_1.34.0    bitops_1.0-7                rappdirs_0.3.3              grid_4.2.0                 
[53] lifecycle_1.0.3             DBI_1.1.3                   magrittr_2.0.3              cli_3.6.0                  
[57] stringi_1.7.12              cachem_1.0.6                XVector_0.38.0              xml2_1.3.3                 
[61] ellipsis_0.3.2              filelock_1.0.2              generics_0.1.3              vctrs_0.5.2                
[65] rjson_0.2.21                restfulr_0.0.15             tools_4.2.0                 bit64_4.0.5                
[69] Biobase_2.58.0              glue_1.6.2                  MatrixGenerics_1.10.0       hms_1.1.2                  
[73] parallel_4.2.0              fastmap_1.1.0               yaml_2.3.7                  AnnotationDbi_1.60.0       
[77] BiocManager_1.30.19         GenomicRanges_1.50.2        memoise_2.0.1               BiocIO_1.8.0
> traceback()
13: .Call2("Rle_constructor", values, lengths, PACKAGE = "S4Vectors")
12: new_Rle(values, lengths)
11: Rle(seqnames)
10: Rle(seqnames)
9: .normarg_seqnames1(seqnames)
8: new_GRanges("GRanges", seqnames = seqnames, ranges = ranges, 
       strand = strand, mcols = mcols, seqinfo = seqinfo)
7: GRanges(ans_seqnames, ans_ranges, strand = ans_strand, ans_mcols, 
       seqinfo = ans_seqinfo)
6: makeGRangesFromDataFrame(df, ...)
5: makeGRangesListFromDataFrame(rangos, names.field = "group_name")
4: .createGRangesGenes.getLocusOverlap(exons.by.gene.disjoint)
3: .createGRangesGenes(genome, geneSymbols)
2: binGenome(genomeTxDb)

I have no idea how to work around this error. Can anyone help me debug this?


I had a similar problem with the hg19 genome from UCSC. I tried several ways of obtaining the GTF and the TxDb object (downloading the GTF file directly from the UCSC website, loading the TxDb directly through R packages, etc.). Apparently some lines in the file are formatted in a way that is not compatible with ASpli.

If your genome is relatively small, you can troubleshoot this in sections: build up your GTF file line by line and run binGenome on the resulting TxDb object until you hit the error. For large genomes, however, this is far too time-consuming. In the end, what worked for me was using the hg19 Ensembl GTF instead of the hg19 known gene GTF file. If different versions of your GTF file are available, I would try them and see whether any of them work.
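
If you do want to narrow it down in sections, something along these lines might save some manual work. This is only a rough sketch: `find_failing_chunks`, `toy_chunks`, and `toy_check` are hypothetical names made up for illustration and are not part of ASpli or any other package. The helper runs a check on each chunk and collects the error messages from the chunks that fail.

```r
# Run `check` on every chunk and keep the error message of each chunk that fails.
# `chunks` is a named list; `check` is any function that raises an error on bad input.
# (Hypothetical helper for illustration, not an ASpli function.)
find_failing_chunks <- function(chunks, check) {
  messages <- vapply(chunks, function(chunk) {
    tryCatch({
      check(chunk)
      NA_character_              # no error: mark the chunk as fine
    }, error = function(e) conditionMessage(e))
  }, character(1))
  messages[!is.na(messages)]     # named vector: chunk name -> error message
}

# Toy demonstration with plain vectors standing in for annotation records.
toy_chunks <- list(chr1 = 1:3, chr2 = c(1, NA), chr3 = 4:6)
toy_check <- function(x) {
  if (anyNA(x)) stop("missing value")
  invisible(TRUE)
}
failing <- find_failing_chunks(toy_chunks, toy_check)
print(failing)

# With a real annotation (not run here; assumes `gtf_path` points to your GTF file):
# library(rtracklayer); library(GenomicFeatures); library(ASpli)
# gtf    <- import(gtf_path)
# chunks <- as.list(split(gtf, seqnames(gtf)))
# chunks <- chunks[lengths(chunks) > 0]   # skip sequences without any records
# failing <- find_failing_chunks(chunks, function(gr) {
#   binGenome(makeTxDbFromGRanges(gr))
# })
```

I have not run the commented-out part against the GTF from the question, so treat it as a starting point. Also note that if the failure comes from genes whose exons sit on more than one sequence (as the "genes were dropped" message hints, though I am not certain that is the cause), splitting by chromosome may hide the problem, because every chunk could pass on its own.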
