I'm recently dealing with a set of MeDIP-seq data with MEDIPS but got an error with "MEDIPS.creatSet". Could anyone with experience help me? Thank you.
My raw paired end reads were mapped to UCSC_mm10 reference genome with BWA. Then I dealt with the data with following codes:
library(MEDIPS) library(BSgenome.Mmusculus.UCSC.mm10) BSgenome="BSgenome.Mmusculus.UCSC.mm10" chr.select=c(“chr1”,"chr2","chr3","chr4","chr5","chr6","chr7","chr8","chr9","chr10","chr11","chr12","chr13","chr14","chr15","chr16","chr17","chr18","chr19","chrX","chrY") extend=0 shift=0 ws=300 uniq=1 ME_Input_2 = MEDIPS.createSet(file = "/home/aligned_bam/LSK-ME-I_sorted.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select, paired=T, bwa=T)
But I got the following output and errors:
Warning: processing of bwa alignment files for paired-end sequencing data is still under development. Reading bam alignment LSK-ME-I_sorted.bam considering chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chrX chrY using bam index Total number of imported first mate reads in properly mapped pairs: 35648142 scanBamFlag: isPaired = T, isProperPair=TRUE , hasUnmappedMate=FALSE, isUnmappedQuery = F, isFirstMateRead = T, isSecondMateRead = F Mean insertion size: 1916.52 nt SD of the insertion size: 360506 nt Max insertion size: 189455410 nt Min insertion size: 0 nt Paired-end alignment (BAM) files generated by bwa are processed in a different way compared to BAM files generated by bowtie, because in the bwa output the first mate can be either the 'left' or the 'right' mate regardless of their alignment to the plus or the minus strand. Creating GRange Object... Keep at most 1 read mapping to the same genomic location. Number of remaining short reads: 32803074 Calculating genomic coordinates... Creating Granges object for genome wide windows... Calculating short read coverage at genome wide windows... Error in queryHits(findOverlaps(query, subject, maxgap = maxgap, minoverlap = minoverlap, : error in evaluating the argument 'x' in selecting a method for function 'queryHits': Error in .Call2("Integer_order2", a, b, decreasing, PACKAGE = "S4Vectors") : long vectors not supported yet: memory.c:3361 Calls: findOverlaps ... findOverlaps -> .local -> <Anonymous> -> .Call2 -> .Call Calls: MEDIPS.createSet ... countOverlaps -> countOverlaps -> .local -> queryHits Execution halted
Here comes my sessionInfo. MEDIPS is in version 1.20 however, Bioconductor is still in version 3.0.
> sessionInfo() R version 3.2.2 (2015-08-14) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu precise (12.04.5 LTS) locale:  LC_CTYPE=en_HK.UTF-8 LC_NUMERIC=C  LC_TIME=en_HK.UTF-8 LC_COLLATE=en_HK.UTF-8  LC_MONETARY=en_HK.UTF-8 LC_MESSAGES=en_HK.UTF-8  LC_PAPER=en_HK.UTF-8 LC_NAME=C  LC_ADDRESS=C LC_TELEPHONE=C  LC_MEASUREMENT=en_HK.UTF-8 LC_IDENTIFICATION=C attached base packages:  stats4 parallel stats graphics grDevices utils datasets  methods base other attached packages:  MEDIPS_1.20.0 Rsamtools_1.18.3 BSgenome_1.34.1  rtracklayer_1.26.3 Biostrings_2.34.1 XVector_0.6.0  GenomicRanges_1.18.4 GenomeInfoDb_1.2.5 IRanges_2.0.1  S4Vectors_0.4.0 BiocGenerics_0.12.1 loaded via a namespace (and not attached):  AnnotationDbi_1.28.2 DNAcopy_1.40.0 magrittr_1.5  zlibbioc_1.12.0 GenomicAlignments_1.2.2 BiocParallel_1.0.3  brew_1.0-6 foreach_1.4.3 stringr_1.0.0  sendmailR_1.2-1 tools_3.2.2 fail_1.3  Biobase_2.26.0 checkmate_1.6.2 DBI_0.3.1  gtools_3.5.0 iterators_1.0.8 BatchJobs_1.6  digest_0.6.8 preprocessCore_1.28.0 base64enc_0.1-3  bitops_1.0-6 codetools_0.2-14 biomaRt_2.22.0  RCurl_1.95-4.7 RSQLite_1.0.0 stringi_0.5-5  BBmisc_1.9 XML_3.98-1.3
I have no idea on the errors and can't find answer myself. :( Any help is appreciated.