Dear ASpli maintainer team/users!
I was wondering if you could help me with an error I encountered when running jCounts; the message is shown in the title and below. I looked into "gbcounts@junction.counts", and I think the error is caused by 325,589 rows that have no "gene_coordinates" information (the column shows "-" instead of coordinates). I would like to filter out these rows and run jCounts again on the remaining 550,301 rows, but so far I have been unsuccessful. I tried subset(gbcounts@junction.counts, gene_coordinates != "-"), but this only returns a filtered data frame and leaves the gbcounts object unchanged. I was thinking of putting the filtered table back into the gbcounts object as "gbcounts@junction.counts", but I don't know how. I would appreciate any guidance on this.
Best regards,
Suong
```r
BiocManager::install("ASpli")
library(ASpli)
library(GenomicFeatures)

gtfFileName <- "/home/suong/Documents/Effluxer project/novel_transcripts/ASpli_R_Oct2022/Pvulgaris_442_v2.1.gene_exons.gff3"
genomeTxDb <- makeTxDbFromGFF( gtfFileName )
features <- binGenome( genomeTxDb )
geneCoord <- featuresg( features )
binCoord <- featuresb( features )
junctionCoord <- featuresj( features )

BAMFiles <- c(
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_16DPA_1_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_16DPA_2_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_16DPA_3_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_16DPA_4_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_21DPA_1_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_21DPA_2_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_21DPA_3_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_21DPA_4_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_26DPA_1_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_26DPA_2_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_26DPA_3_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/EM_26DPA_4_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_16DPA_1_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_16DPA_2_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_16DPA_3_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_16DPA_4_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_21DPA_1_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_21DPA_2_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_21DPA_3_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_21DPA_4_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_26DPA_1_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_26DPA_2_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_26DPA_3_valAligned.sortedByCoord.out.bam",
  "/home/suong/Documents/Effluxer project/novel_transcripts/bam_bai/SC_26DPA_4_valAligned.sortedByCoord.out.bam")

(targets <- data.frame(
  bam = BAMFiles,
  tissue = c( 'Embryo','Embryo','Embryo','Embryo','Embryo','Embryo',
              'Embryo','Embryo','Embryo','Embryo','Embryo','Embryo',
              'Seedcoat','Seedcoat','Seedcoat','Seedcoat','Seedcoat','Seedcoat',
              'Seedcoat','Seedcoat','Seedcoat','Seedcoat','Seedcoat','Seedcoat'),
  time = c( '16DPA', '16DPA', '16DPA', '16DPA', '21DPA', '21DPA', '21DPA', '21DPA',
            '26DPA', '26DPA', '26DPA', '26DPA', '16DPA', '16DPA', '16DPA', '16DPA',
            '21DPA', '21DPA', '21DPA', '21DPA', '26DPA', '26DPA', '26DPA', '26DPA'),
  stringsAsFactors = FALSE ))
getConditions( targets )

mBAMs <- data.frame(
  bam = sub("_[1_2_3_4_]","",targets$bam[c(1,5,9,13,17,21)]),
  condition = c("Embryo_16DPA","Embryo_21DPA","Embryo_26DPA",
                "Seedcoat_16DPA", "Seedcoat_21DPA", "Seedcoat_26DPA"))
```
Read counting against annotated features:
```r
gbcounts <- gbCounts(features=features, targets=targets, minReadLength = 100, maxISize = 50000)
gbcounts
```
Junction-based de-novo counting and splicing signal estimation:
```r
asd <- jCounts(counts=gbcounts, features=features, minReadLength=100)
asd
```
My error output:
```
asd <- jCounts(counts=gbcounts, features=features, minReadLength=100)
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 'sort': 'start' or 'end' cannot contain NAs
In addition: Warning messages:
1: In matrix(unlist(strsplit(jnames, "[.]")), byrow = TRUE, ncol = 3) :
  data length [1646215] is not a sub-multiple or multiple of the number of rows [548739]
2: In .createGRangesExpJunctions(junctionNames) : NAs introduced by coercion
```
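The first warning hints that some junction names do not split into exactly three "."-separated fields, which would then produce the NAs in `start`/`end`. A minimal diagnostic sketch in base R, using made-up junction names (the `scaffold_12.1` sequence name is hypothetical, chosen only to show a name with an extra dot), could look like this:

```r
# Toy junction names in "sequence.start.end" form. The second name is a
# hypothetical case with a dot inside the sequence name, so it splits
# into four fields instead of three.
jnames <- c("Chr01.1200.1850", "scaffold_12.1.300.900", "Chr02.500.760")

# Count the "."-separated fields per name, using the same split pattern
# that appears in the warning message.
nFields <- lengths(strsplit(jnames, "[.]"))

# Names that do not yield exactly three fields are the suspects.
badNames <- jnames[nFields != 3]
print(badNames)
print(table(nFields))
```

On the real object, `jnames` would presumably be `rownames(gbcounts@junction.counts)`; that the junction names are stored as row names is an assumption here, not something confirmed in this thread.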
Here is my junction.counts data
sessionInfo( )
```
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.1

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8 LC_MONETARY=en_AU.UTF-8
 [6] LC_MESSAGES=en_AU.UTF-8 LC_PAPER=en_AU.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
 [1] dplyr_1.0.10 GenomicFeatures_1.48.4 GenomicRanges_1.48.0 GenomeInfoDb_1.32.4 ASpli_2.6.0 AnnotationDbi_1.58.0
 [7] IRanges_2.30.1 S4Vectors_0.34.0 Biobase_2.56.0 BiocGenerics_0.42.0 edgeR_3.38.4 limma_3.52.4

loaded via a namespace (and not attached):
  [1] colorspace_2.0-3 deldir_1.0-6 rjson_0.2.21 ellipsis_0.3.2 htmlTable_2.4.1
  [6] biovizBase_1.44.0 XVector_0.36.0 base64enc_0.1-3 dichromat_2.0-0.1 rstudioapi_0.14
 [11] DT_0.25 bit64_4.0.5 fansi_1.0.3 xml2_1.3.3 codetools_0.2-18
 [16] splines_4.2.1 cachem_1.0.6 knitr_1.40 Formula_1.2-4 Rsamtools_2.12.0
 [21] cluster_2.1.4 dbplyr_2.2.1 png_0.1-7 BiocManager_1.30.18 compiler_4.2.1
 [26] httr_1.4.4 backports_1.4.1 lazyeval_0.2.2 assertthat_0.2.1 Matrix_1.5-1
 [31] fastmap_1.1.0 cli_3.4.1 htmltools_0.5.3 prettyunits_1.1.1 tools_4.2.1
 [36] igraph_1.3.5 gtable_0.3.1 glue_1.6.2 GenomeInfoDbData_1.2.8 rappdirs_0.3.3
 [41] Rcpp_1.0.9 vctrs_0.4.2 Biostrings_2.64.1 rtracklayer_1.56.1 xfun_0.33
 [46] stringr_1.4.1 lifecycle_1.0.3 ensembldb_2.20.2 restfulr_0.0.15 statmod_1.4.37
 [51] XML_3.99-0.11 zlibbioc_1.42.0 MASS_7.3-58 scales_1.2.1 BiocStyle_2.24.0
 [56] BSgenome_1.64.0 VariantAnnotation_1.42.1 ProtGenerics_1.28.0 hms_1.1.2 MatrixGenerics_1.8.1
 [61] SummarizedExperiment_1.26.1 AnnotationFilter_1.20.0 RColorBrewer_1.1-3 yaml_2.3.5 curl_4.3.3
 [66] memoise_2.0.1 gridExtra_2.3 ggplot2_3.3.6 UpSetR_1.4.0 rpart_4.1.16
 [71] biomaRt_2.52.0 latticeExtra_0.6-30 stringi_1.7.8 RSQLite_2.2.18 BiocIO_1.6.0
 [76] checkmate_2.1.0 filelock_1.0.2 BiocParallel_1.30.3 rlang_1.0.6 pkgconfig_2.0.3
 [81] matrixStats_0.62.0 bitops_1.0-7 evaluate_0.17 lattice_0.20-45 purrr_0.3.5
 [86] htmlwidgets_1.5.4 GenomicAlignments_1.32.1 cowplot_1.1.1 bit_4.0.4 tidyselect_1.2.0
 [91] plyr_1.8.7 magrittr_2.0.3 R6_2.5.1 generics_0.1.3 Hmisc_4.7-1
 [96] DelayedArray_0.22.0 DBI_1.1.3 foreign_0.8-82 pillar_1.8.1 nnet_7.3-17
[101] survival_3.4-0 KEGGREST_1.36.3 RCurl_1.98-1.9 tibble_3.1.8 crayon_1.5.2
[106] interp_1.1-3 utf8_1.2.2 BiocFileCache_2.4.0 rmarkdown_2.17 jpeg_0.1-9
[111] progress_1.2.2 locfit_1.5-9.6 grid_4.2.1 data.table_1.14.2 blob_1.2.3
[116] digest_0.6.29 pbmcapply_1.5.1 tidyr_1.2.1 munsell_0.5.0 Gviz_1.40.1
```
Hey there! I was having this exact problem, with that same error. I had no clue what was going wrong, because it only showed up with specific BAM files and not with others. I eventually realized that the error sometimes showed up when I used TrimGalore (to remove low-quality reads and adapters), but disappeared when I processed the same library with Trimmomatic. That's how I worked around the problem (nowhere near actually solving it hahaha, but enough for me to move on). Regards!
Hello, I am late to the party, but I had the same issue with some datasets, and I solved it by adding the following line of code:
Example:
Cheerio, M
The problem might have been solved in devel version 2.11.1 (https://www.bioconductor.org/packages/devel/bioc/html/ASpli.html).
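On the original question of how to put a filtered table back into the object: in R, a slot of an S4 object can be reassigned with `@<-`. The sketch below uses a hypothetical stand-in class (`ToyCounts`, invented for illustration) with a single `junction.counts` slot, so it runs without ASpli. It only shows the mechanics; whether dropping the "-" rows actually lets jCounts run was not confirmed in this thread.

```r
library(methods)

# Hypothetical stand-in for the counts object: an S4 class with one
# data.frame slot named like the slot mentioned in the question.
setClass("ToyCounts", slots = c(junction.counts = "data.frame"))

toy <- new("ToyCounts", junction.counts = data.frame(
  gene_coordinates = c("Chr01:100-900", "-", "Chr02:50-400"),
  sample1 = c(12L, 3L, 7L),
  row.names = c("Chr01.150.300", "Chr01.5000.5200", "Chr02.80.120"),
  stringsAsFactors = FALSE
))

# Filter a copy of the slot, keeping only rows with gene coordinates.
jc <- toy@junction.counts
filtered <- jc[jc$gene_coordinates != "-", , drop = FALSE]

# Assign the filtered table back into the object and re-check validity.
toy@junction.counts <- filtered
validObject(toy)

print(nrow(toy@junction.counts))
```

Applied to the real object, the same pattern would be `gbcounts@junction.counts <- subset(gbcounts@junction.counts, gene_coordinates != "-")`. Other slots of the object are left untouched by this, so it is worth checking that downstream functions do not expect them to stay in sync with the junction table.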