Hello! I am trying to look at alternative splicing using ASpli package. Below is the workflow and session info. I get an error stating "Error: vector memory exhausted (limit reached?)". Initially I tried to run this with 18 files and assumed it was the reason for error, but this problem persisted even when running with 8 files. RNAseq was performed using Illumina TruSeq stranded library prep, libraries are 75bp paired-end run. Not sure what I am doing wrong (R/Bioconductor novice!!). Kindly advise.
Workflow:
source("https://bioconductor.org/biocLite.R")
biocLite(c("Rsubread", "edgeR", "limma", "Glimma","GenomicFeatures", "AnnotationDbi","org.Hs.eg.db","biomaRt","ASpli"))
library(GenomicFeatures)
dir()
getwd()
setwd("/Users/AP/Desktop/R_trials_6518")
file.path <- ("/Users/AP/Desktop/R_trials_6518")
gtffile <- file.path("Homo_sapiens.GRCh38.92.chr.gtf")
txdb2 <- makeTxDbFromGFF(gtffile, format="gtf")
library(edgeR)
library(ASpli)
features2 <- binGenome(txdb2)
geneCoord2 <- featuresg(features2)
binCoord2 <- featuresb(features2)
junctionCoord2 <- featuresj(features2)
binMetadata2 <- mcols(binCoord2)
bamFiles2 <- c("1.hg38.sorted.bam", "2.hg38.sorted.bam", "3.hg38.sorted.bam", "4.hg38.sorted.bam", "5.hg38.sorted.bam", "6.hg38.sorted.bam", "7.hg38.sorted.bam", "8.hg38.sorted.bam")
targets2 <- data.frame(row.names = c("1","2","3","4","5","6","7","8"),
bam = bamFiles2,
genotype = c("CONTROL","CONTROL","CONTROL","CONTROL","Dx","Dx","Dx","Dx"),
stringsAsFactors = FALSE )
targets2
getConditions(targets2)
bam <- loadBAM(targets2)
counts2 <- readCounts(features2,
bam,
targets,
readLength = 70L,
maxISize = 50000,
minAnchor = 7)
Output
> getwd()
[1] "/Users/AP/Desktop/R_trials_6518"
> setwd("/Users/AP/Desktop/R_trials_6518")
> file.path <- ("/Users/AP/Desktop/R_trials_6518")
> gtffile <- file.path("Homo_sapiens.GRCh38.92.chr.gtf")
> txdb2 <- makeTxDbFromGFF(gtffile, format="gtf")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .get_cds_IDX(type, phase) :
The "phase" metadata column contains non-NA values for features of type stop_codon. This information was ignored.
> library(edgeR)
> library(ASpli)
> features2 <- binGenome(txdb2)
* Number of extracted Genes = 58336
* Number of extracted Exon Bins = 588989
* Number of extracted intron bins = 480619
* Number of extracted trascripts = 203675
* Number of extracted junctions = 356962
* Number of AS bins (not include external) = 82674
* Number of AS bins (include external) = 84450
* Classified as:
ES bins = 29231 (35%)
IR bins = 12028 (15%)
Alt5'ss bins = 9177 (11%)
Alt3'ss bins = 9802 (12%)
Multiple AS bins = 22436 (27%)
classified as:
ES bins = 6998 (31%)
IR bins = 4673 (21%)
Alt5'ss bins = 4324 (19%)
Alt3'ss bins = 5363 (24%)
> geneCoord2 <- featuresg(features2)
> binCoord2 <- featuresb(features2)
> junctionCoord2 <- featuresj(features2)
> binMetadata2 <- mcols(binCoord2)
> bamFiles2 <- c("1.sorted.bam", "2.hg38.sorted.bam", "3.hg38.sorted.bam", "4.hg38.sorted.bam", "5.hg38.sorted.bam", "6.hg38.sorted.bam", "7.hg38.sorted.bam", "8.hg38.sorted.bam")
> targets2 <- data.frame(row.names = c("1","2","3","4","5","6","7","8"),
+ bam = bamFiles2,
+ genotype = c("CONTROL","CONTROL","CONTROL","CONTROL","Dx","Dx","Dx","Dx"),
+ stringsAsFactors = FALSE )
> targets2
bam genotype
1 1.hg38.sorted.bam CONTROL
2 2.hg38.sorted.bam CONTROL
3 3.hg38.sorted.bam CONTROL
4 4.hg38.sorted.bam CONTROL
5 5.hg38.sorted.bam Dx
6 6.hg38.sorted.bam Dx
7 7.hg38.sorted.bam Dx
8 8.hg38.sorted.bam Dx
> getConditions(targets2)
[1] "CONTROL" "Dx"
> bam <- loadBAM(targets2)
> counts2 <- readCounts(features2,
+ bam,
+ targets,
+ readLength = 70L,
+ maxISize = 50000,
+ minAnchor = 7)
Read summarization by gene completed
Read summarization by bin completed
Read summarization by ei1 region completed
Read summarization by ie2 region completed
Error: vector memory exhausted (limit reached?)
In addition: There were 32 warnings (use warnings() to see them)
> GeneCounts <- countsg(counts2)
Error in countsg(counts2) : object 'counts2' not found
##all 32 warnings state:
1: In .Seqinfo.mergexy(x, y) :
The 2 combined objects have no sequence levels in common. (Use
suppressWarnings() to suppress this warning.)
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] ASpli_1.6.0 edgeR_3.22.2 limma_3.36.1 GenomicFeatures_1.32.0 AnnotationDbi_1.42.1 Biobase_2.40.0 GenomicRanges_1.32.3 GenomeInfoDb_1.16.0
[9] IRanges_2.14.10 S4Vectors_0.18.2 BiocGenerics_0.26.0 BiocInstaller_1.30.0
loaded via a namespace (and not attached):
[1] httr_1.3.1 bit64_0.9-7 splines_3.5.0 Formula_1.2-3 assertthat_0.2.0 latticeExtra_0.6-28
[7] blob_1.1.1 BSgenome_1.48.0 GenomeInfoDbData_1.1.0 Rsamtools_1.32.0 yaml_2.1.19 progress_1.1.2
[13] pillar_1.2.3 RSQLite_2.1.1 backports_1.1.2 lattice_0.20-35 biovizBase_1.28.0 digest_0.6.15
[19] checkmate_1.8.5 RColorBrewer_1.1-2 XVector_0.20.0 colorspace_1.3-2 htmltools_0.3.6 Matrix_1.2-14
[25] plyr_1.8.4 XML_3.98-1.11 pkgconfig_2.0.1 biomaRt_2.36.1 zlibbioc_1.26.0 scales_0.5.0
[31] BiocParallel_1.14.1 htmlTable_1.12 tibble_1.4.2 AnnotationFilter_1.4.0 ggplot2_2.2.1 SummarizedExperiment_1.10.1
[37] nnet_7.3-12 Gviz_1.24.0 lazyeval_0.2.1 survival_2.42-3 magrittr_1.5 memoise_1.1.0
[43] evaluate_0.10.1 foreign_0.8-70 data.table_1.11.4 tools_3.5.0 prettyunits_1.0.2 BiocStyle_2.8.2
[49] matrixStats_0.53.1 stringr_1.3.1 munsell_0.4.3 locfit_1.5-9.1 cluster_2.0.7-1 DelayedArray_0.6.0
[55] ensembldb_2.4.1 Biostrings_2.48.0 compiler_3.5.0 rlang_0.2.1 grid_3.5.0 RCurl_1.95-4.10
[61] dichromat_2.0-0 rstudioapi_0.7 VariantAnnotation_1.26.0 htmlwidgets_1.2 bitops_1.0-6 base64enc_0.1-3
[67] rmarkdown_1.9 gtable_0.2.0 curl_3.2 DBI_1.0.0 R6_2.2.2 GenomicAlignments_1.16.0
[73] gridExtra_2.3 knitr_1.20 rtracklayer_1.40.3 bit_1.1-14 Hmisc_4.1-1 rprojroot_1.3-2
[79] ProtGenerics_1.12.0 stringi_1.2.2 Rcpp_0.12.17 rpart_4.1-13 acepack_1.4.1
Many thanks! Looking forward to suggestions/advise.
Best regards,
Amruta