BSmooth: mclapply error when smoothing using BSmooth
zuupcat • 0
@zuupcat-12187
Last seen 7.3 years ago

Hey everybody,

I'm running into an error when trying to smooth my BSseq object with BSmooth(). Strangely, the error does not occur when I reduce the number of CpG loci to 165428 or fewer (yes, exactly this number). Even stranger, I cannot reproduce the error with the BS.chr22 cancer dataset, even though it contains more than 490k CpG loci. Is this a bug? Do you have an idea how to work around it?

Below I have also included the function I use to build the bsseq object from a list of .bed files, in case it is relevant. The only difference I could detect when comparing the cancer dataset to my object is that most attributes of the cancer dataset have the class "Formal class 'DataFrame' [package "IRanges"] with x slots", whereas those of my object have the class "Formal class 'DataFrame' [package "S4Vectors"] with x slots".

bed2bsseq <- function(bed.list) {
  n <- length(bed.list)
  bed <- bed.list[[1]]  # loci are taken from the first table
  samplenames <- paste("r", 1:n, sep = "")
  pdata <- data.frame(Rep = paste("replicate", 1:n, sep = ""),
                      row.names = samplenames)
  gr <- GRanges(seqnames = bed$chr,
                ranges = IRanges(bed$start, bed$start),
                strand = bed$strand)
  # column 6 of each table is passed as M, column 5 as Cov
  met.list <- sapply(bed.list, "[", c(6))
  met <- do.call(cbind, met.list)
  cov.list <- sapply(bed.list, "[", c(5))
  cov <- do.call(cbind, cov.list)
  bs <- BSseq(M = met, Cov = cov, gr = gr,
              sampleNames = samplenames, pData = pdata)
  return(bs)
}
> bs <- bed2bsseq(bed.list) 
> bs 
An object of type 'BSseq' with
  1165998 methylation loci
  4 samples
has not been smoothed
> bs.smooth <- BSmooth(bs) 
[BSmooth] preprocessing ... done in 1.0 sec 
[BSmooth] smoothing by 'sample' 
(mc.cores = 1, mc.preschedule = FALSE) 
Error in BSmooth(bs) : 
BSmooth encountered smoothing errors 
In addition: 
Warning message: 
In mclapply(seq(along = sampleNames), function(sIdx) { : 4 function calls resulted in an error

 

Hopefully somebody can help!

Best regards, zuupcat

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.2

locale:
[1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] bsseq_1.8.2 limma_3.28.21 SummarizedExperiment_1.2.3 Biobase_2.32.0
[5] GenomicRanges_1.24.3 GenomeInfoDb_1.8.7 IRanges_2.6.1 S4Vectors_0.10.3
[9] BiocGenerics_0.18.0

loaded via a namespace (and not attached):
[1] Rcpp_0.12.7 XVector_0.12.1 zlibbioc_1.18.0 munsell_0.4.3 colorspace_1.2-7
[6] lattice_0.20-34 plyr_1.8.4 tools_3.3.2 grid_3.3.2 data.table_1.9.6
[11] R.oo_1.21.0 gtools_3.5.0 matrixStats_0.51.0 permute_0.9-4 R.utils_2.4.0
[16] scales_0.4.0 R.methodsS3_1.7.1 locfit_1.5-9.1 chron_2.3-47

bsseq bsmooth • 1.3k views
zuupcat • 0
@zuupcat-12187

I was able to solve the problem. Apparently something went wrong in the conversion from my list of bed files to the bsseq object, which left the CpG positions out of order. That made the check all(diff(pos) > 0) fail and the function abort. I don't really know what the underlying problem is, since the positions in the GRanges object are still correctly ordered before the bsseq object is created. I worked around the issue by no longer constructing the IRanges object explicitly when preparing the bsseq object and leaving everything to the internal processing. My modified function is below.

bed2bsseq2 <- function(bed.list)
{
  n <- length(bed.list)
  bed <- bed.list[[1]]  # loci are taken from the first table

  samplenames <- paste("r", 1:n, sep = "")
  pdata <- data.frame(Rep = paste("replicate", 1:n, sep = ""),
                      row.names = samplenames)

  # column 6 of each table is passed as M, column 5 as Cov
  met.list <- sapply(bed.list, "[", c(6))
  met <- do.call(cbind, met.list)

  cov.list <- sapply(bed.list, "[", c(5))
  cov <- do.call(cbind, cov.list)

  # pass chr and pos directly instead of a pre-built GRanges object
  bs <- BSseq(chr = bed$chr, pos = bed$start, M = met, Cov = cov,
              sampleNames = samplenames, pData = pdata)
  return(bs)
}
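
For anyone hitting the same error: the ordering check mentioned above can be run on the input before smoothing. The following is only a minimal base R sketch, not bsseq code. The helper name check_increasing and the toy table are made up for illustration, and the column names chr and start are assumed to match the bed tables used in this thread.

```r
# Hypothetical helper (not part of bsseq): report, per chromosome, whether
# positions are strictly increasing, i.e. whether all(diff(pos) > 0) holds.
check_increasing <- function(chr, pos) {
  # split() groups positions by chromosome while keeping their input order
  vapply(split(pos, chr), function(p) all(diff(p) > 0), logical(1))
}

# Toy data standing in for the first bed table (columns chr and start assumed)
bed <- data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  start = c(100L, 250L, 180L, 50L, 90L),
  stringsAsFactors = FALSE
)

# chr1 is out of order (250 comes before 180), chr2 is fine
ok <- check_increasing(bed$chr, bed$start)
print(ok)

# Reordering by chromosome, then position, gives a strictly increasing order
# as long as no position is duplicated within a chromosome
bed_sorted <- bed[order(bed$chr, bed$start), ]
ok_sorted <- check_increasing(bed_sorted$chr, bed_sorted$start)
print(ok_sorted)
```

If the rows of the bed table are reordered like this, the rows of the M and Cov matrices have to be reordered with the same index, otherwise the counts no longer belong to the right loci. Duplicated positions within a chromosome also fail the check, because diff() is then 0 rather than greater than 0.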
@kasper-daniel-hansen-2979
Last seen 10 months ago
United States
That is weird. How many samples do you have?

4 samples. However, reducing the number of samples does not help either (at least not when subsetting the bsseq object directly).

Edit: Or is there a way to get more information about this error? I tried to load the BSmooth function manually and run it line by line, but unfortunately I ran into problems there as well.
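
One general way to get at the underlying message when a per-sample loop only reports "n function calls resulted in an error" is to run the steps serially and catch each error. The following is a minimal base R sketch of that idea. smooth_one is a hypothetical stand-in for a failing per-sample step, not the actual bsseq internals, and the toy position vectors are invented.

```r
# Hypothetical stand-in for a per-sample step that can fail; NOT bsseq code.
# It aborts when positions are not strictly increasing.
smooth_one <- function(pos) {
  stopifnot(all(diff(pos) > 0))
  pos
}

# Toy input: r1 is ordered, r2 is not
samples <- list(r1 = c(1, 5, 9), r2 = c(1, 9, 5))

# Run each step serially and keep the error message instead of aborting
results <- lapply(samples, function(pos) {
  tryCatch(
    list(ok = TRUE, value = smooth_one(pos), message = NA_character_),
    error = function(e) {
      list(ok = FALSE, value = NULL, message = conditionMessage(e))
    }
  )
})

# Names of the samples whose step failed, and the message for one of them
failed <- names(results)[!vapply(results, function(r) r$ok, logical(1))]
print(failed)
print(results$r2$message)
```

The captured message names the condition that failed, which is usually enough to tell whether the problem lies in the data (for example, unordered positions) or somewhere else. Calling traceback() right after an interactive error is another base R option.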
