Running ASpli on R results in "Error: vector memory exhausted (limit reached?)"
0
0
Entering edit mode
arphatak • 0
@arphatak-16061
Last seen 4.9 years ago

Hello! I am trying to look at alternative splicing using ASpli package. Below is the workflow and session info. I get an error stating "Error: vector memory exhausted (limit reached?)". Initially I tried to run this with 18 files and assumed it was the reason for error, but this problem persisted even when running with 8 files.  RNAseq was performed using Illumina TruSeq stranded library prep, libraries are 75bp paired-end run. Not sure what I am doing wrong (R/Bioconductor novice!!). Kindly advise.

Workflow:

source("https://bioconductor.org/biocLite.R")
biocLite(c("Rsubread", "edgeR", "limma", "Glimma","GenomicFeatures", "AnnotationDbi","org.Hs.eg.db","biomaRt","ASpli"))
library(GenomicFeatures)
dir()
getwd()
setwd("/Users/AP/Desktop/R_trials_6518")
file.path <- ("/Users/AP/Desktop/R_trials_6518")
gtffile <- file.path("Homo_sapiens.GRCh38.92.chr.gtf")
txdb2 <- makeTxDbFromGFF(gtffile, format="gtf")
library(edgeR)
library(ASpli)
features2 <- binGenome(txdb2)
geneCoord2 <- featuresg(features2)
binCoord2 <- featuresb(features2)
junctionCoord2 <- featuresj(features2)
binMetadata2 <- mcols(binCoord2)
bamFiles2 <- c("1.hg38.sorted.bam", "2.hg38.sorted.bam", "3.hg38.sorted.bam", "4.hg38.sorted.bam", "5.hg38.sorted.bam", "6.hg38.sorted.bam", "7.hg38.sorted.bam", "8.hg38.sorted.bam")
targets2 <- data.frame(row.names = c("1","2","3","4","5","6","7","8"),
                       bam = bamFiles2,
                       genotype = c("CONTROL","CONTROL","CONTROL","CONTROL","Dx","Dx","Dx","Dx"),
                       stringsAsFactors = FALSE )
targets2
getConditions(targets2)
bam <- loadBAM(targets2)
counts2 <- readCounts(features2,
                      bam,
                      targets,
                      readLength = 70L,
                      maxISize = 50000,
                      minAnchor = 7)

Output

> getwd()

[1] "/Users/AP/Desktop/R_trials_6518"

> setwd("/Users/AP/Desktop/R_trials_6518")

> file.path <- ("/Users/AP/Desktop/R_trials_6518")

> gtffile <- file.path("Homo_sapiens.GRCh38.92.chr.gtf")

> txdb2 <- makeTxDbFromGFF(gtffile, format="gtf")

Import genomic features from the file as a GRanges object ... OK

Prepare the 'metadata' data frame ... OK

Make the TxDb object ... OK

Warning message:

In .get_cds_IDX(type, phase) :

  The "phase" metadata column contains non-NA values for features of type stop_codon. This information was ignored.

> library(edgeR)

> library(ASpli)

> features2 <- binGenome(txdb2)

* Number of extracted Genes = 58336

* Number of extracted Exon Bins = 588989

* Number of extracted intron bins = 480619

* Number of extracted trascripts = 203675

* Number of extracted junctions = 356962

* Number of AS bins (not include external) = 82674

* Number of AS bins (include external) = 84450

* Classified as: 

ES bins = 29231 (35%)

IR bins = 12028 (15%)

Alt5'ss bins = 9177 (11%)

Alt3'ss bins = 9802 (12%)

Multiple AS bins = 22436 (27%)

classified as:

ES bins = 6998 (31%)

IR bins = 4673 (21%)

Alt5'ss bins = 4324 (19%)

Alt3'ss bins = 5363 (24%)

 

> geneCoord2 <- featuresg(features2)

> binCoord2 <- featuresb(features2)

> junctionCoord2 <- featuresj(features2)

> binMetadata2 <- mcols(binCoord2)

> bamFiles2 <- c("1.sorted.bam", "2.hg38.sorted.bam", "3.hg38.sorted.bam", "4.hg38.sorted.bam", "5.hg38.sorted.bam", "6.hg38.sorted.bam", "7.hg38.sorted.bam", "8.hg38.sorted.bam")

> targets2 <- data.frame(row.names = c("1","2","3","4","5","6","7","8"),

+                        bam = bamFiles2,

+                        genotype = c("CONTROL","CONTROL","CONTROL","CONTROL","Dx","Dx","Dx","Dx"),

+                        stringsAsFactors = FALSE )

> targets2

                                    bam genotype

1             1.hg38.sorted.bam  CONTROL

2             2.hg38.sorted.bam  CONTROL

3             3.hg38.sorted.bam  CONTROL

4             4.hg38.sorted.bam  CONTROL

5             5.hg38.sorted.bam      Dx

6             6.hg38.sorted.bam      Dx

7             7.hg38.sorted.bam      Dx

8             8.hg38.sorted.bam      Dx

> getConditions(targets2)

[1] "CONTROL" "Dx"    

> bam <- loadBAM(targets2)

> counts2 <- readCounts(features2,

+                       bam,

+                       targets,

+                       readLength = 70L,

+                       maxISize = 50000,

+                       minAnchor = 7)

Read summarization by gene completed

Read summarization by bin completed

Read summarization by ei1 region completed

Read summarization by ie2 region completed

Error: vector memory exhausted (limit reached?)

In addition: There were 32 warnings (use warnings() to see them)

> GeneCounts <- countsg(counts2)
Error in countsg(counts2) : object 'counts2' not found

##all 32 warnings state:

1: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ASpli_1.6.0            edgeR_3.22.2           limma_3.36.1           GenomicFeatures_1.32.0 AnnotationDbi_1.42.1   Biobase_2.40.0         GenomicRanges_1.32.3   GenomeInfoDb_1.16.0   
 [9] IRanges_2.14.10        S4Vectors_0.18.2       BiocGenerics_0.26.0    BiocInstaller_1.30.0  

loaded via a namespace (and not attached):
 [1] httr_1.3.1                  bit64_0.9-7                 splines_3.5.0               Formula_1.2-3               assertthat_0.2.0            latticeExtra_0.6-28        
 [7] blob_1.1.1                  BSgenome_1.48.0             GenomeInfoDbData_1.1.0      Rsamtools_1.32.0            yaml_2.1.19                 progress_1.1.2             
[13] pillar_1.2.3                RSQLite_2.1.1               backports_1.1.2             lattice_0.20-35             biovizBase_1.28.0           digest_0.6.15              
[19] checkmate_1.8.5             RColorBrewer_1.1-2          XVector_0.20.0              colorspace_1.3-2            htmltools_0.3.6             Matrix_1.2-14              
[25] plyr_1.8.4                  XML_3.98-1.11               pkgconfig_2.0.1             biomaRt_2.36.1              zlibbioc_1.26.0             scales_0.5.0               
[31] BiocParallel_1.14.1         htmlTable_1.12              tibble_1.4.2                AnnotationFilter_1.4.0      ggplot2_2.2.1               SummarizedExperiment_1.10.1
[37] nnet_7.3-12                 Gviz_1.24.0                 lazyeval_0.2.1              survival_2.42-3             magrittr_1.5                memoise_1.1.0              
[43] evaluate_0.10.1             foreign_0.8-70              data.table_1.11.4           tools_3.5.0                 prettyunits_1.0.2           BiocStyle_2.8.2            
[49] matrixStats_0.53.1          stringr_1.3.1               munsell_0.4.3               locfit_1.5-9.1              cluster_2.0.7-1             DelayedArray_0.6.0         
[55] ensembldb_2.4.1             Biostrings_2.48.0           compiler_3.5.0              rlang_0.2.1                 grid_3.5.0                  RCurl_1.95-4.10            
[61] dichromat_2.0-0             rstudioapi_0.7              VariantAnnotation_1.26.0    htmlwidgets_1.2             bitops_1.0-6                base64enc_0.1-3            
[67] rmarkdown_1.9               gtable_0.2.0                curl_3.2                    DBI_1.0.0                   R6_2.2.2                    GenomicAlignments_1.16.0   
[73] gridExtra_2.3               knitr_1.20                  rtracklayer_1.40.3          bit_1.1-14                  Hmisc_4.1-1                 rprojroot_1.3-2            
[79] ProtGenerics_1.12.0         stringi_1.2.2               Rcpp_0.12.17                rpart_4.1-13                acepack_1.4.1   

Many thanks! Looking forward to suggestions/advise.

 

Best regards,
Amruta

 

aspli rstudio • 1.2k views
ADD COMMENT

Login before adding your answer.

Traffic: 480 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6