Entering edit mode
Hi,
I am running SpliceWiz following these indications here https://bioconductor.org/packages/devel/bioc/vignettes/SpliceWiz/inst/doc/SW_QuickStart.html#Quick-Start.
I obtained my reference by running
require(AnnotationHub)
ah = AnnotationHub()
query(ah, c("Homo Sapiens", "release-94"))
ref_path = file.path(tempdir(), "Reference")
buildRef(
reference_path = ref_path,
fasta = "AH65745",
gtf = "AH64631",
genome_type = "hg38" # take into account mappability exclusions
)
and provided BAMs sorted by read name (using samtools collate). Everything works fine until the following step:
nxtse_path <- file.path(tempdir(), "NxtSE_output")
collateData(
Experiment = expr,
reference_path = ref_path,
output_path = nxtse_path
)
Error in stop_if_wrong_length("'seqnames'", ans_len) : 'seqnames' must have the length of the object to construct (1) or length 1
11.
stop(wmsg(what, " must have the length of the object ", "to construct (", ans_len, ") or length 1"))
10.
stop_if_wrong_length("'seqnames'", ans_len)
9.
new_GRanges("GRanges", seqnames = seqnames, ranges = ranges, strand = strand, mcols = mcols, seqinfo = seqinfo)
8.
GRanges(seqnames = as.character(seqnames), ranges = IRanges(start = start, end = start + 1), strand = "+")
7.
eval(substitute(expr), data, enclos = parent.frame())
6.
eval(substitute(expr), data, enclos = parent.frame())
5.
with.default(junc.common.unanno, GRanges(seqnames = as.character(seqnames), ranges = IRanges(start = start, end = start + 1), strand = "+"))
4.
with(junc.common.unanno, GRanges(seqnames = as.character(seqnames), ranges = IRanges(start = start, end = start + 1), strand = "+"))
3.
.collateData_junc_annotate(2, reference_path, norm_output_path, lowMemoryMode)
2.
.collateData_annotate(reference_path, norm_output_path, stranded, novelSplicing, lowMemoryMode, minSamplesWithJunc = novelSplicing_minSamples, minSamplesAboveJuncThreshold = novelSplicing_minSamplesAboveThreshold, novelSplicing_requireOneAnnotatedSJ = novelSplicing_requireOneAnnotatedSJ)
1.
collateData(Experiment = expr, reference_path = ref_path, output_path = nxtse_path)
sessionInfo( )
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] fstcore_0.9.14 AnnotationHub_3.6.0 BiocFileCache_2.6.0 dbplyr_2.2.1 BiocGenerics_0.44.0 pheatmap_1.0.12
[7] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.9 purrr_0.3.4 readr_2.1.2 tidyr_1.2.0
[13] tibble_3.1.7 ggplot2_3.3.6 tidyverse_1.3.1 SpliceWiz_1.0.2 NxtIRFdata_1.4.0
loaded via a namespace (and not attached):
[1] readxl_1.4.0 backports_1.4.1 lazyeval_0.2.2 shinydashboard_0.7.2
[5] splines_4.2.0 BiocParallel_1.31.8 GenomeInfoDb_1.34.3 digest_0.6.29
[9] foreach_1.5.2 ca_0.71.1 htmltools_0.5.2 viridis_0.6.2
[13] fansi_1.0.3 magrittr_2.0.3 memoise_2.0.1 BSgenome_1.66.1
[17] tzdb_0.3.0 shinyFiles_0.9.3 Biostrings_2.66.0 annotate_1.76.0
[21] modelr_0.1.8 matrixStats_0.62.0 R.utils_2.11.0 prettyunits_1.1.1
[25] colorspace_2.0-3 rvest_1.0.2 blob_1.2.3 rappdirs_0.3.3
[29] haven_2.5.0 xfun_0.31 crayon_1.5.1 RCurl_1.98-1.7
[33] jsonlite_1.8.0 genefilter_1.79.0 survival_3.3-1 iterators_1.0.14
[37] glue_1.6.2 registry_0.5-1 gtable_0.3.0 zlibbioc_1.44.0
[41] XVector_0.38.0 webshot_0.5.4 DelayedArray_0.23.1 Rhdf5lib_1.20.0
[45] HDF5Array_1.26.0 scales_1.2.0 DBI_1.1.3 Rcpp_1.0.8.3
[49] viridisLite_0.4.0 xtable_1.8-4 progress_1.2.2 bit_4.0.4
[53] stats4_4.2.0 DT_0.26 htmlwidgets_1.5.4 httr_1.4.3
[57] RColorBrewer_1.1-3 ellipsis_0.3.2 pkgconfig_2.0.3 XML_3.99-0.10
[61] R.methodsS3_1.8.2 sass_0.4.1 utf8_1.2.2 tidyselect_1.1.2
[65] rlang_1.0.6 later_1.3.0 AnnotationDbi_1.60.0 cellranger_1.1.0
[69] munsell_0.5.0 BiocVersion_3.16.0 tools_4.2.0 cachem_1.0.6
[73] cli_3.4.1 generics_0.1.2 RSQLite_2.2.14 broom_0.8.0
[77] fastmap_1.1.0 heatmaply_1.4.2 yaml_2.3.5 ompBAM_1.2.0
[81] knitr_1.39 bit64_4.0.5 fs_1.5.2 KEGGREST_1.38.0
[85] dendextend_1.16.0 sparseMatrixStats_1.10.0 mime_0.12 rhandsontable_0.3.8
[89] R.oo_1.25.0 xml2_1.3.3 compiler_4.2.0 rstudioapi_0.13
[93] plotly_4.10.0 filelock_1.0.2 curl_4.3.2 png_0.1-7
[97] interactiveDisplayBase_1.36.0 reprex_2.0.1 stringi_1.7.6 bslib_0.3.1
[101] lattice_0.20-45 Matrix_1.5-1 vctrs_0.4.1 pillar_1.7.0
[105] lifecycle_1.0.1 rhdf5filters_1.10.0 BiocManager_1.30.18 jquerylib_0.1.4
[109] data.table_1.14.2 bitops_1.0-7 seriation_1.4.1 httpuv_1.6.5
[113] rtracklayer_1.57.0 GenomicRanges_1.50.1 R6_2.5.1 BiocIO_1.8.0
[117] promises_1.2.0.1 TSP_1.2-1 gridExtra_2.3 IRanges_2.32.0
[121] codetools_0.2-18 assertthat_0.2.1 rhdf5_2.42.0 SummarizedExperiment_1.28.0
[125] rjson_0.2.21 withr_2.5.0 shinyWidgets_0.7.6 GenomicAlignments_1.34.0
[129] Rsamtools_2.14.0 S4Vectors_0.36.0 GenomeInfoDbData_1.2.8 parallel_4.2.0
[133] hms_1.1.1 fst_0.9.8 grid_4.2.0 DelayedMatrixStats_1.20.0
[137] MatrixGenerics_1.10.0 lubridate_1.8.0 Biobase_2.58.0 shiny_1.7.1
[141] restfulr_0.0.15
I understand this is the output of the internal function new_GRanges() from the GenomicRanges package, however I can't figure out what's wrong with my input files.
Thank you
Hi Chiara,
Yes, the issue appears to be in collateData(), whereby there is a sanity check missing. After obtaining a list of novel junction reads, it removes those with seqnames that do not match that of the provided reference. But then it doesn't check whether there are any remaining after filtering before proceeding further.
I will patch this in an upcoming update.
Another user encountered this issue when using Gencode-aligned BAM files and Ensembl reference. In the meantime, please try creating a new reference using the same reference FASTA / GTF files as that used to align your BAM files. I hope that should resolve this issue.