Question

ASpli package gives an error in jCounts function and suggest a parameter change in granges()

0

Entering edit mode

hamza_karakurt ▴ 60

@hamza_karakurt-17704

Last seen 2.9 years ago

Turkey

Hello everyone, I am trying to use ASpli for alternative splicing analysis. I have mouse data. I downloaded genome fasta and gtf file from Ensembl (GRCh39). I used hisat2 for alignment via the code:

hisat2 -p 12 -x GRCm39_ht_indexed -1 SRR6376403_1.fastq -2 SRR6376403_2.fastq -S SRR6376403.sam

and I sorted and indexed the SAM via samtools and merged the BAM files of each condition to a single BAM file.

Code should be placed in three backticks as shown below

library(ASpli)
library(GenomicFeatures)

genomeTxDb <- makeTxDbFromGFF("GRCm39.105.gtf")
features <- binGenome(genomeTxDb)

load("GRCm38_gtf_data.RData")

BAMFiles <- list.files(path="/home/huk/Documents/R/tjob_mm/BAM_Files" , pattern = ".bam" , full.names = TRUE)

targets <- data.frame(row.names = paste0('Sample',c(4:10)),
                      bam = BAMFiles[c(3,5,7,9,11,13,15)],
                      f1 = c('CBG','CBG','CBG','CBG','WBG','WBG','WBG'),
                      stringsAsFactors = FALSE)

mBAMs <- data.frame(bam = BAMFiles[c(1,27)],
                    condition = c("CBG","WBG"))


gbcounts <- gbCounts(features=features, targets=targets,
                     minReadLength = 30, maxISize = 50000 , libType = "PE")


asd <- jCounts(counts=gbcounts, features=features, minReadLength=30 , libType = "PE")

gb <- gbDUreport(gbcounts, contrast = c(-1,1))

jdur <- jDUreport(asd, contrast=c(-1,1))

sr <- splicingReport(gb, jdur, counts=gbcounts)

is <- integrateSignals(sr,asd)

exportIntegratedSignals(is,sr=sr,
                        output.dir = "mm_adipose",
                        counts=gbcounts,features=features,asd=asd,
                        mergedBams = mBAMs)

gbCounts function gives these warning but not an error:

Warning messages:
1: In .make_GAlignmentPairs_from_GAlignments(gal, strandMode = strandMode,  ... :
    11329 alignments with ambiguous pairing were dumped.
    Use 'getDumpedAlignments()' to retrieve them from the dump environment.
2: In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.
3: In .make_GAlignmentPairs_from_GAlignments(gal, strandMode = strandMode,  ... :
    14593 alignments with ambiguous pairing were dumped.
    Use 'getDumpedAlignments()' to retrieve them from the dump environment.
4: In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.
5: In .make_GAlignmentPairs_from_GAlignments(gal, strandMode = strandMode,  ... :
    13303 alignments with ambiguous pairing were dumped.
    Use 'getDumpedAlignments()' to retrieve them from the dump environment.
6: In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.
7: In .make_GAlignmentPairs_from_GAlignments(gal, strandMode = strandMode,  ... :
    18998 alignments with ambiguous pairing were dumped.
    Use 'getDumpedAlignments()' to retrieve them from the dump environment.
8: In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.
9: In .make_GAlignmentPairs_from_GAlignments(gal, strandMode = strandMode,  ... :
    5345 alignments with ambiguous pairing were dumped.
    Use 'getDumpedAlignments()' to retrieve them from the dump environment.
10: In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.
11: In .make_GAlignmentPairs_from_GAlignments(gal, strandMode = strandMode,  ... :
    2538 alignments with ambiguous pairing were dumped.
    Use 'getDumpedAlignments()' to retrieve them from the dump environment.
12: In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.
13: In .make_GAlignmentPairs_from_GAlignments(gal, strandMode = strandMode,  ... :
    13423 alignments with ambiguous pairing were dumped.
    Use 'getDumpedAlignments()' to retrieve them from the dump environment.
14: In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.

The error message I had in jCounts step:

Junctions PJU completed
Error in .local(x, use.names, use.mcols, ...) : 
  For some pairs in 'x', the 2 alignments are not on the same chromosome. Cannot associate a
  unique genomic range to such pairs. Please call granges() with
  'on.discordant.seqnames="drop"' to drop these pairs, or with 'on.discordant.seqnames="split"'
  to represent each of them with 2 genomic ranges in the returned GRanges object. Note that in
  both cases the returned object won't be parallel to 'x'. Alternatively, please consider using
  grglist() instead of granges() to turn 'x' into a GRangesList object instead of a GRanges
  object. See ?GAlignmentPairs for more information.
In addition: Warning messages:
1: In .make_GAlignmentPairs_from_GAlignments(gal, strandMode = strandMode,  :
    11329 alignments with ambiguous pairing were dumped.
    Use 'getDumpedAlignments()' to retrieve them from the dump environment.
2: In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.

When I change the parameter, libType to 'SE' it works but I am not sure it is the right thing to do. If someone gives me a hint, I would be so grateful. As I can see error suggests using 'on.discordant.seqnames="drop"' on granges() function but I am not also sure which step includes this parameter.

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=tr_TR.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=tr_TR.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=tr_TR.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=tr_TR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GenomicFeatures_1.38.2 GenomicRanges_1.38.0   GenomeInfoDb_1.22.1    ASpli_2.5.0           
 [5] AnnotationDbi_1.48.0   IRanges_2.20.2         S4Vectors_0.24.4       Biobase_2.46.0        
 [9] BiocGenerics_0.32.0    edgeR_3.28.1           limma_3.42.2          

loaded via a namespace (and not attached):
  [1] colorspace_2.0-2            ellipsis_0.3.2              biovizBase_1.34.1          
  [4] htmlTable_2.4.0             XVector_0.26.0              base64enc_0.1-3            
  [7] dichromat_2.0-0             rstudioapi_0.13             DT_0.20                    
 [10] bit64_4.0.5                 fansi_1.0.2                 splines_3.6.3              
 [13] cachem_1.0.6                knitr_1.37                  Formula_1.2-4              
 [16] Rsamtools_2.2.3             cluster_2.1.2               dbplyr_2.1.1               
 [19] png_0.1-7                   BiocManager_1.30.16         compiler_3.6.3             
 [22] httr_1.4.2                  backports_1.4.1             assertthat_0.2.1           
 [25] Matrix_1.4-0                fastmap_1.1.0               lazyeval_0.2.2             
 [28] cli_3.1.1                   htmltools_0.5.2             prettyunits_1.1.1          
 [31] tools_3.6.3                 igraph_1.2.11               gtable_0.3.0               
 [34] glue_1.6.1                  GenomeInfoDbData_1.2.2      dplyr_1.0.8                
 [37] rappdirs_0.3.3              Rcpp_1.0.8                  vctrs_0.3.8                
 [40] Biostrings_2.54.0           rtracklayer_1.46.0          xfun_0.29                  
 [43] stringr_1.4.0               lifecycle_1.0.1             ensembldb_2.10.2           
 [46] XML_3.99-0.3                zlibbioc_1.32.0             MASS_7.3-55                
 [49] scales_1.1.1                BSgenome_1.54.0             BiocStyle_2.14.4           
 [52] VariantAnnotation_1.32.0    ProtGenerics_1.18.0         hms_1.1.1                  
 [55] SummarizedExperiment_1.16.1 AnnotationFilter_1.10.0     RColorBrewer_1.1-2         
 [58] yaml_2.2.2                  curl_4.3.2                  memoise_2.0.1              
 [61] gridExtra_2.3               ggplot2_3.3.5               UpSetR_1.4.0               
 [64] biomaRt_2.42.1              rpart_4.1.16                latticeExtra_0.6-29        
 [67] stringi_1.7.6               RSQLite_2.2.9               checkmate_2.0.0            
 [70] BiocParallel_1.20.1         rlang_1.0.1                 pkgconfig_2.0.3            
 [73] matrixStats_0.61.0          bitops_1.0-7                evaluate_0.14              
 [76] lattice_0.20-45             purrr_0.3.4                 GenomicAlignments_1.22.1   
 [79] htmlwidgets_1.5.4           bit_4.0.4                   tidyselect_1.1.1           
 [82] plyr_1.8.6                  magrittr_2.0.2              R6_2.5.1                   
 [85] generics_0.1.2              Hmisc_4.6-0                 DelayedArray_0.12.3        
 [88] DBI_1.1.2                   pillar_1.7.0                foreign_0.8-76             
 [91] survival_3.2-13             RCurl_1.98-1.5              nnet_7.3-17                
 [94] tibble_3.1.6                crayon_1.4.2                utf8_1.2.2                 
 [97] BiocFileCache_1.10.2        rmarkdown_2.11              jpeg_0.1-9                 
[100] progress_1.2.2              locfit_1.5-9.4              grid_3.6.3                 
[103] data.table_1.14.2           blob_1.2.2                  digest_0.6.29              
[106] pbmcapply_1.5.0             tidyr_1.2.0                 openssl_1.4.6              
[109] munsell_0.5.0               Gviz_1.30.3                 askpass_1.1

Thank you in advance

ASpli alterna AlternativeSplicing • 898 views

ADD COMMENT • link 3.4 years ago hamza_karakurt ▴ 60