SpliceWiz collateData error
2
0
Entering edit mode
@aca36346
Last seen 2 hours ago
Italy

Hi,

I am running SpliceWiz following these indications here https://bioconductor.org/packages/devel/bioc/vignettes/SpliceWiz/inst/doc/SW_QuickStart.html#Quick-Start.

I obtained my reference by running

require(AnnotationHub)
ah = AnnotationHub()
query(ah, c("Homo Sapiens", "release-94"))

ref_path = file.path(tempdir(), "Reference")
buildRef(
    reference_path = ref_path,
    fasta = "AH65745",
    gtf = "AH64631",
    genome_type = "hg38" # take into account mappability exclusions
)

and provided BAMs sorted by read name (using samtools collate). Everything works fine until the following step:

nxtse_path <- file.path(tempdir(), "NxtSE_output")
collateData(
    Experiment = expr,
    reference_path = ref_path,
    output_path = nxtse_path
)
Error in stop_if_wrong_length("'seqnames'", ans_len) : 'seqnames' must have the length of the object to construct (1) or length 1
11.
stop(wmsg(what, " must have the length of the object ", "to construct (", ans_len, ") or length 1"))
10.
stop_if_wrong_length("'seqnames'", ans_len)
9.
new_GRanges("GRanges", seqnames = seqnames, ranges = ranges, strand = strand, mcols = mcols, seqinfo = seqinfo)
8.
GRanges(seqnames = as.character(seqnames), ranges = IRanges(start = start, end = start + 1), strand = "+")
7.
eval(substitute(expr), data, enclos = parent.frame())
6.
eval(substitute(expr), data, enclos = parent.frame())
5.
with.default(junc.common.unanno, GRanges(seqnames = as.character(seqnames), ranges = IRanges(start = start, end = start + 1), strand = "+"))
4.
with(junc.common.unanno, GRanges(seqnames = as.character(seqnames), ranges = IRanges(start = start, end = start + 1), strand = "+"))
3.
.collateData_junc_annotate(2, reference_path, norm_output_path, lowMemoryMode)
2.
.collateData_annotate(reference_path, norm_output_path, stranded, novelSplicing, lowMemoryMode, minSamplesWithJunc = novelSplicing_minSamples, minSamplesAboveJuncThreshold = novelSplicing_minSamplesAboveThreshold, novelSplicing_requireOneAnnotatedSJ = novelSplicing_requireOneAnnotatedSJ)
1.
collateData(Experiment = expr, reference_path = ref_path, output_path = nxtse_path)


sessionInfo( )
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] fstcore_0.9.14      AnnotationHub_3.6.0 BiocFileCache_2.6.0 dbplyr_2.2.1        BiocGenerics_0.44.0 pheatmap_1.0.12    
 [7] forcats_0.5.1       stringr_1.4.0       dplyr_1.0.9         purrr_0.3.4         readr_2.1.2         tidyr_1.2.0        
[13] tibble_3.1.7        ggplot2_3.3.6       tidyverse_1.3.1     SpliceWiz_1.0.2     NxtIRFdata_1.4.0   

loaded via a namespace (and not attached):
  [1] readxl_1.4.0                  backports_1.4.1               lazyeval_0.2.2                shinydashboard_0.7.2         
  [5] splines_4.2.0                 BiocParallel_1.31.8           GenomeInfoDb_1.34.3           digest_0.6.29                
  [9] foreach_1.5.2                 ca_0.71.1                     htmltools_0.5.2               viridis_0.6.2                
 [13] fansi_1.0.3                   magrittr_2.0.3                memoise_2.0.1                 BSgenome_1.66.1              
 [17] tzdb_0.3.0                    shinyFiles_0.9.3              Biostrings_2.66.0             annotate_1.76.0              
 [21] modelr_0.1.8                  matrixStats_0.62.0            R.utils_2.11.0                prettyunits_1.1.1            
 [25] colorspace_2.0-3              rvest_1.0.2                   blob_1.2.3                    rappdirs_0.3.3               
 [29] haven_2.5.0                   xfun_0.31                     crayon_1.5.1                  RCurl_1.98-1.7               
 [33] jsonlite_1.8.0                genefilter_1.79.0             survival_3.3-1                iterators_1.0.14             
 [37] glue_1.6.2                    registry_0.5-1                gtable_0.3.0                  zlibbioc_1.44.0              
 [41] XVector_0.38.0                webshot_0.5.4                 DelayedArray_0.23.1           Rhdf5lib_1.20.0              
 [45] HDF5Array_1.26.0              scales_1.2.0                  DBI_1.1.3                     Rcpp_1.0.8.3                 
 [49] viridisLite_0.4.0             xtable_1.8-4                  progress_1.2.2                bit_4.0.4                    
 [53] stats4_4.2.0                  DT_0.26                       htmlwidgets_1.5.4             httr_1.4.3                   
 [57] RColorBrewer_1.1-3            ellipsis_0.3.2                pkgconfig_2.0.3               XML_3.99-0.10                
 [61] R.methodsS3_1.8.2             sass_0.4.1                    utf8_1.2.2                    tidyselect_1.1.2             
 [65] rlang_1.0.6                   later_1.3.0                   AnnotationDbi_1.60.0          cellranger_1.1.0             
 [69] munsell_0.5.0                 BiocVersion_3.16.0            tools_4.2.0                   cachem_1.0.6                 
 [73] cli_3.4.1                     generics_0.1.2                RSQLite_2.2.14                broom_0.8.0                  
 [77] fastmap_1.1.0                 heatmaply_1.4.2               yaml_2.3.5                    ompBAM_1.2.0                 
 [81] knitr_1.39                    bit64_4.0.5                   fs_1.5.2                      KEGGREST_1.38.0              
 [85] dendextend_1.16.0             sparseMatrixStats_1.10.0      mime_0.12                     rhandsontable_0.3.8          
 [89] R.oo_1.25.0                   xml2_1.3.3                    compiler_4.2.0                rstudioapi_0.13              
 [93] plotly_4.10.0                 filelock_1.0.2                curl_4.3.2                    png_0.1-7                    
 [97] interactiveDisplayBase_1.36.0 reprex_2.0.1                  stringi_1.7.6                 bslib_0.3.1                  
[101] lattice_0.20-45               Matrix_1.5-1                  vctrs_0.4.1                   pillar_1.7.0                 
[105] lifecycle_1.0.1               rhdf5filters_1.10.0           BiocManager_1.30.18           jquerylib_0.1.4              
[109] data.table_1.14.2             bitops_1.0-7                  seriation_1.4.1               httpuv_1.6.5                 
[113] rtracklayer_1.57.0            GenomicRanges_1.50.1          R6_2.5.1                      BiocIO_1.8.0                 
[117] promises_1.2.0.1              TSP_1.2-1                     gridExtra_2.3                 IRanges_2.32.0               
[121] codetools_0.2-18              assertthat_0.2.1              rhdf5_2.42.0                  SummarizedExperiment_1.28.0  
[125] rjson_0.2.21                  withr_2.5.0                   shinyWidgets_0.7.6            GenomicAlignments_1.34.0     
[129] Rsamtools_2.14.0              S4Vectors_0.36.0              GenomeInfoDbData_1.2.8        parallel_4.2.0               
[133] hms_1.1.1                     fst_0.9.8                     grid_4.2.0                    DelayedMatrixStats_1.20.0    
[137] MatrixGenerics_1.10.0         lubridate_1.8.0               Biobase_2.58.0                shiny_1.7.1                  
[141] restfulr_0.0.15

I understand this is the output of the internal function new_GRanges() from the GenomicRanges package, however I can't figure out what's wrong with my input files.

Thank you

SpliceWiz • 102 views
ADD COMMENT
1
Entering edit mode

Hi Chiara,

Yes, the issue appears to be in collateData(), whereby there is a sanity check missing. After obtaining a list of novel junction reads, it removes those with seqnames that do not match that of the provided reference. But then it doesn't check whether there are any remaining after filtering before proceeding further.

I will patch this in an upcoming update.

Another user encountered this issue when using Gencode-aligned BAM files and Ensembl reference. In the meantime, please try creating a new reference using the same reference FASTA / GTF files as that used to align your BAM files. I hope that should resolve this issue.

ADD REPLY
0
Entering edit mode
@aca36346
Last seen 2 hours ago
Italy

For anyone interested, see this post https://github.com/alexchwong/SpliceWiz/issues/33

ADD COMMENT

Login before adding your answer.

Traffic: 378 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6