ASpli: jCounts() Error in av[at] <- a[at] : NAs are not allowed in subscripted assignments
3
1
Entering edit mode
Emma ▴ 20
@8a800875
Last seen 17 days ago
Australia

Hello ASpli maintainers/users!

I'm using ASpli v2.0.0, and I've run into some problems with the jCounts() function - when I run jCounts(), I get the error Error in av[at] <- a[at] : NAs are not allowed in subscripted assignments.

I'm not sure what the error means, so didn't know where to start with finding a solution.

I've included my code below (note that the low number of AS bins is normal for this genome, as there are very few annotated AS events.)

I would greatly appreciate any help with this!

Emma


> TxDb <- makeTxDbFromGFF("/data/cephfs/punim0875/emma/NMD/genome/PlasmoDB-45_Pfalciparum3D7.gff")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK

> features <-binGenome(TxDb)
* Number of extracted Genes = 5712
* Number of extracted Exon Bins = 11771
* Number of extracted intron bins = 8858
* Number of extracted trascripts = 5800
* Number of extracted junctions = 8766
* Number of AS bins (not include external) = 83
* Number of AS bins (include external) = 83
* Classified as: 
    ES bins = 20    (24%)
    IR bins = 3 (4%)
    Alt5'ss bins = 25   (30%)
    Alt3'ss bins = 33   (40%)
    Multiple AS bins = 2    (2%)
    classified as:
            ES bins = 1 (50%)
            IR bins = 0 (0%)
            Alt5'ss bins = 1    (50%)
            Alt3'ss bins = 0    (0%)

> bamFiles <- c("/data/cephfs/punim0875/emma/splicing/STAR_output/191220_A00692_0043_TL1911902_SR10-1_MAN-20191216_TSStrmRNA_L000Aligned.sortedByCoord.out.bam",
+               "/data/cephfs/punim0875/emma/splicing/STAR_output/191220_A00692_0043_TL1911904_SR10-2_MAN-20191216_TSStrmRNA_L000Aligned.sortedByCoord.out.bam",
+               "/data/cephfs/punim0875/emma/splicing/STAR_output/191220_A00692_0043_TL1911906_SR10-3_MAN-20191216_TSStrmRNA_L000Aligned.sortedByCoord.out.bam",
+               "/data/cephfs/punim0875/emma/splicing/STAR_output/191220_A00692_0043_TL1911903_SR12-5-1_MAN-20191216_TSStrmRNA_L000Aligned.sortedByCoord.out.bam",
+               "/data/cephfs/punim0875/emma/splicing/STAR_output/191220_A00692_0043_TL1911905_SR12-5-2_MAN-20191216_TSStrmRNA_L000Aligned.sortedByCoord.out.bam",
+               "/data/cephfs/punim0875/emma/splicing/STAR_output/191220_A00692_0043_TL1911907_SR12-5-3_MAN-20191216_TSStrmRNA_L000Aligned.sortedByCoord.out.bam")
> targets <- data.frame(row.names = c("Ctrl_1", "Ctrl_2", "Ctrl_3", "KD_1", "KD_2", "KD_3"),
+                       bam = bamFiles,
+                       genotype = c("Ctrl", "Ctrl", "Ctrl", "KD", "KD", "KD"),
+                       stringsAsFactors = FALSE)

> gbcounts <- gbCounts(
+   features,
+   targets,
+   minReadLength = 100L,
+   maxISize = 3000,
+   minAnchor = 10, 
+   strandMode = 2)
Summarizing Ctrl_1
ETA: 30 min
Summarizing Ctrl_2
ETA: 23 min
Summarizing Ctrl_3
ETA: 15 min
Summarizing KD_1
ETA: 11 min
Summarizing KD_2
ETA: 6 min
Summarizing KD_3
ETA: 0 min

> asd <- jCounts(counts = gbcounts,
+                features = features,
+                minReadLength = 100L, 
+                strandMode = 2)
Error in av[at] <- a[at] : NAs are not allowed in subscripted assignments

And here is my session info:

> sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C               LC_TIME=en_AU.UTF-8       
 [4] LC_COLLATE=en_AU.UTF-8     LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GenomicFeatures_1.42.3 GenomicRanges_1.42.0   GenomeInfoDb_1.26.7    ASpli_2.0.0           
 [5] AnnotationDbi_1.52.0   IRanges_2.24.1         S4Vectors_0.28.1       Biobase_2.50.0        
 [9] BiocGenerics_0.36.1    edgeR_3.32.1           limma_3.46.0          

loaded via a namespace (and not attached):
  [1] colorspace_2.0-1            ellipsis_0.3.2              biovizBase_1.38.0          
  [4] htmlTable_2.1.0             XVector_0.30.0              base64enc_0.1-3            
  [7] dichromat_2.0-0             rstudioapi_0.13             DT_0.18                    
 [10] bit64_4.0.5                 fansi_0.4.2                 xml2_1.3.2                 
 [13] splines_4.0.5               cachem_1.0.4                knitr_1.33                 
 [16] Formula_1.2-4               Rsamtools_2.6.0             cluster_2.1.1              
 [19] dbplyr_2.1.1                png_0.1-7                   BiocManager_1.30.15        
 [22] compiler_4.0.5              httr_1.4.2                  backports_1.2.1            
 [25] lazyeval_0.2.2              assertthat_0.2.1            Matrix_1.3-2               
 [28] fastmap_1.1.0               htmltools_0.5.1.1           prettyunits_1.1.1          
 [31] tools_4.0.5                 igraph_1.2.6                gtable_0.3.0               
 [34] glue_1.4.2                  GenomeInfoDbData_1.2.4      dplyr_1.0.6                
 [37] rappdirs_0.3.3              tinytex_0.31                Rcpp_1.0.6                 
 [40] vctrs_0.3.8                 Biostrings_2.58.0           rtracklayer_1.50.0         
 [43] xfun_0.22                   stringr_1.4.0               lifecycle_1.0.0            
 [46] ensembldb_2.14.1            XML_3.99-0.6                zlibbioc_1.36.0            
 [49] MASS_7.3-53.1               scales_1.1.1                BiocStyle_2.18.1           
 [52] BSgenome_1.58.0             VariantAnnotation_1.36.0    ProtGenerics_1.22.0        
 [55] hms_1.0.0                   MatrixGenerics_1.2.1        SummarizedExperiment_1.20.0
 [58] AnnotationFilter_1.14.0     RColorBrewer_1.1-2          yaml_2.2.1                 
 [61] curl_4.3.1                  memoise_2.0.0               gridExtra_2.3              
 [64] ggplot2_3.3.3               UpSetR_1.4.0                biomaRt_2.46.3             
 [67] rpart_4.1-15                latticeExtra_0.6-29         stringi_1.6.1              
 [70] RSQLite_2.2.7               checkmate_2.0.0             BiocParallel_1.24.1        
 [73] rlang_0.4.11                pkgconfig_2.0.3             matrixStats_0.58.0         
 [76] bitops_1.0-7                evaluate_0.14               lattice_0.20-41            
 [79] purrr_0.3.4                 GenomicAlignments_1.26.0    htmlwidgets_1.5.3          
 [82] bit_4.0.4                   tidyselect_1.1.1            plyr_1.8.6                 
 [85] magrittr_2.0.1              R6_2.5.0                    generics_0.1.0             
 [88] Hmisc_4.5-0                 DelayedArray_0.16.3         DBI_1.1.1                  
 [91] pillar_1.6.0                foreign_0.8-81              survival_3.2-10            
 [94] RCurl_1.98-1.3              nnet_7.3-15                 tibble_3.1.1               
 [97] crayon_1.4.1                utf8_1.2.1                  BiocFileCache_1.14.0       
[100] rmarkdown_2.8               jpeg_0.1-8.1                progress_1.2.2             
[103] locfit_1.5-9.4              grid_4.0.5                  data.table_1.14.0          
[106] blob_1.2.1                  digest_0.6.27               tidyr_1.1.3                
[109] openssl_1.4.4               munsell_0.5.0               Gviz_1.34.1                
[112] askpass_1.1
ASpli • 255 views
ADD COMMENT
1
Entering edit mode
emancini ▴ 50
@emancini-10867
Last seen 23 days ago
Argentina

Hi Emma, thanks for using ASpli

can you check how is your junction counts object?

head(gbcounts@junction.counts)

Just to confirm your parameter strandMode=2, is something special with your sample regarding strand specificty?

In the meantime, can you try with our devel version?

http://bioconductor.org/packages/devel/bioc/html/ASpli.html

We didd correct a bug recently regarding NA junctions. Let me know in case you need hekp how to use it,

thanks

Estefi

ADD COMMENT
0
Entering edit mode

Thanks for your reply Estefi.

Here is what my gbcounts@junction.counts looks like (the genes in head() are from organellar genomes):

> head(gbcounts@junction.counts)
                   junction  gene strand multipleHit symbol gene_coordinates                             bin_spanned j_within_bin Ctrl_1 Ctrl_2 Ctrl_3 KD_1 KD_2 KD_3
Pf_M76611.150.2823    noHit noHit      *           -      -                - PF3D7_API04200:E001;PF3D7_API04200:E002            -      0      0      1    0    0    0
Pf_M76611.151.781     noHit noHit      *           -      -                -                     PF3D7_API04200:E001            -      0      0      0    3    0    0
Pf_M76611.161.1859    noHit noHit      *           -      -                - PF3D7_API04200:E001;PF3D7_API04200:E002            -      0      1      0    0    0    0
Pf_M76611.164.1879    noHit noHit      *           -      -                - PF3D7_API04200:E001;PF3D7_API04200:E002            -      0      0      1    0    0    0
Pf_M76611.169.740     noHit noHit      *           -      -                -                     PF3D7_API04200:E001            -      0      0      0    0    0    1
Pf_M76611.340.451     noHit noHit      *           -      -                -                                       -            -      1      4      1    6    1    8
> tail(gbcounts@junction.counts)
                            junction  gene strand multipleHit symbol gene_coordinates bin_spanned j_within_bin Ctrl_1 Ctrl_2 Ctrl_3 KD_1 KD_2 KD_3
Pf3D7_14_v3.3291613.3291666    noHit noHit      *           -      -                -           -            -      0      0      0    0    0    2
Pf3D7_14_v3.3291613.3291713    noHit noHit      *           -      -                -           -            -      2      0      0    0    0    0
Pf3D7_14_v3.3291613.3291899    noHit noHit      *           -      -                -           -            -      2      0      0    0    0    0
Pf3D7_14_v3.3291747.3291796    noHit noHit      *           -      -                -           -            -      0      2      0    0    0    0
Pf3D7_14_v3.3291816.3291829    noHit noHit      *           -      -                -           -            -      0      0      0    2    0    0
Pf3D7_API_v3.8303.8368         noHit noHit      *           -      -                -           -            -      0      1      0    0    0    0

These are the same as junction counts I got from running the same data through ASpli v1.10.0.

And to confirm, the reason I put strandMode = 2 is because it was a TruSeq library prep. I read in the ?strandMode info that this number should be the same as what is used for featureCounts() in the Rsubread package (where I use strandSpecific = 2).

I will try the devel version now, as you suggest and let you know if that works. Thank you very much for your help!

ADD REPLY
0
Entering edit mode

Hi Estefi,

I tried to do the devel version install, but it requires R 4.1, which I don't have on my cluster. It's not possible for me to upgrade to R 4.1 myself. Is there another way I can use the devel version or fix the error I'm getting? Thanks! :)

ADD REPLY
1
Entering edit mode
emancini ▴ 50
@emancini-10867
Last seen 23 days ago
Argentina

Thanks Emma.

I see there are junctions. Cool!

You can try to download devel version here:

  • Source Package ASpli_2.1.3.tar.gz

and load using: devtools::load_all("ASpli/")

ADD COMMENT
1
Entering edit mode

Hi Estefi,

I just tried the dev version but got the same error. This was the traceback:

> asd <- jCounts(counts = gbcounts,
+                features = features, 
+                minReadLength = 100L, 
+                strandMode = 2)
 Error in av[at] <- a[at] : NAs are not allowed in subscripted assignments 
6.
vecMin(filter[, i], byCond[, j]) at ASDiscover_functions.R#35
5.
.filterJunctionBySample(df0 = df0, targets = targets, threshold = threshold) at ASpli_methods.R#653
4.
AsDiscover(counts = counts, targets = NULL, features = features, 
    bam = NULL, readLength = minReadLength, threshold = threshold, 
    cores = 1, minAnchor = minAnchor, libType = libType, strandMode = strandMode, 
    alignFastq = alignFastq, dropBAM = dropBAM) at ASpli_methods.R#583
3.
AsDiscover(counts = counts, targets = NULL, features = features, 
    bam = NULL, readLength = minReadLength, threshold = threshold, 
    cores = 1, minAnchor = minAnchor, libType = libType, strandMode = strandMode, 
    alignFastq = alignFastq, dropBAM = dropBAM) at ASpli_methods.R#573
2.
jCounts(counts = gbcounts, features = features, minReadLength = 100L, 
    strandMode = 2) at ASpli_methods.R#545
1.
jCounts(counts = gbcounts, features = features, minReadLength = 100L, 
    strandMode = 2)
ADD REPLY
1
Entering edit mode

Hi Emma

We would like to check your gbcounts object in case we find something weird in junctions (could be) can you upload in some place? You can send me the link by email.

In the meantime, can you try to run jCounts function, using this parameter: threshold = 1. It could be a matter of coverage. For the sake of curiosity, do you find some "known" junction in your data?

Thanks a lot and I am sorry for the delay in the answer,

Estefi

ADD REPLY
1
Entering edit mode
emancini ▴ 50
@emancini-10867
Last seen 23 days ago
Argentina

Hi Emma,

It seems we fix the bug. Can you try again, using our devel version?

http://bioconductor.org/packages/devel/bioc/html/ASpli.html

thanks for your patience,

Estefi

ADD COMMENT
0
Entering edit mode

Thanks Estefi, I'm not longer getting that error. I am now getting a different error when I try to run the jCounts() function, but I will email you about it rather than start a new thread as its in the dev version.

Emma

ADD REPLY

Login before adding your answer.

Traffic: 246 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6