Hello, I am working with bsseq to find DMRs in single-replicate samples (Control and Expt) from WGBS. Creating the bsseq object and smoothing both worked well. However, I encounter errors in the BSmooth.tstat function.
BS.OBJ.tstat = BSmooth.tstat(BS.OBJ.keep, group1="Control", group2="Expt", estimate.var = "paired", local.correct = TRUE, verbose = TRUE)
Error: length(group1) + length(group2) >= 3 is not TRUE
Or sometimes the error is:
[BSmooth.tstat] preprocessing ... done in 2.5 sec
[BSmooth.tstat] computing stats within groups ... done in 0.4 sec
[BSmooth.tstat] computing stats across groups ... Error in approxfun(xx, yy) :
need at least two non-NA values to interpolate
I see that one of these errors has been mentioned in another post here, but that post does not say how to resolve the problem.
Am I using the command correctly? I am having a hard time working out whether bsseq can handle single-replicate data. If it can, which steps or options need to change? The documentation does not mention any special options for single-replicate data, and we often prefer to merge the data from replicates into a single file per sample.
Thank you very much in advance.
Best regards,
UJ
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] bsseq_0.10.0 matrixStats_0.14.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] Biobase_2.22.0 colorspace_1.2-4 grid_3.0.2 lattice_0.20-23 locfit_1.5-9.1 munsell_0.4.2 plyr_1.8.1 Rcpp_0.11.4 scales_0.2.4
[10] stats4_3.0.2 tools_3.0.2
I had the same problem, which I struggled with for days. This lovely error:
Moreover, sometimes it ran, but every row of the adjusted-statistic column contained this error, so DMR analysis was impossible.
As the author said, the chromosomes are too small for approxfun to work, that is, there are not enough CpGs per chromosome (as I understood it). So how to fix it? First, take a look at your chromosomes:
In this data frame you'll see the length of each chromosome. If a length is 1-3, you'll receive this approxfun error. You have to remove such chromosomes from your .bedGraph files (from Bismark or MethylDackel) before reading the data, that is, before this step:
You can do this manually, but I recommend writing a simple script in Python or bash. Once these chromosomes are removed, read the data again and repeat the analysis.
BSmooth.tstat should then work fine.
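For the filtering script mentioned above, here is a minimal Python sketch. It rests on assumptions that are not from the original posts: the input is a whitespace-delimited bedGraph with the chromosome name in the first column, header lines start with `track` or `#`, and a chromosome is dropped when it has fewer than 4 rows (chosen because lengths of 1-3 were reported to trigger the error). The function names and the `min_sites` parameter are made up for illustration.

```python
from collections import Counter

HEADER_PREFIXES = ("track", "#")  # assumed header markers in the bedGraph


def count_sites_per_chrom(lines):
    """Count data rows per chromosome, skipping blank and header lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith(HEADER_PREFIXES):
            continue
        # First whitespace-delimited field is taken to be the chromosome name.
        counts[line.split(None, 1)[0]] += 1
    return counts


def filter_small_chroms(lines, min_sites=4):
    """Return the lines whose chromosome has at least `min_sites` rows.

    Header lines are kept unchanged; blank lines are dropped.
    """
    lines = list(lines)  # the input is scanned twice: count, then filter
    counts = count_sites_per_chrom(lines)
    kept = []
    for line in lines:
        if not line.strip():
            continue
        if line.startswith(HEADER_PREFIXES):
            kept.append(line)
        elif counts[line.split(None, 1)[0]] >= min_sites:
            kept.append(line)
    return kept


def filter_bedgraph_file(in_path, out_path, min_sites=4):
    """Write a copy of `in_path` without the sparsely covered chromosomes."""
    with open(in_path) as src:
        kept = filter_small_chroms(src, min_sites)
    with open(out_path, "w") as dst:
        dst.writelines(kept)
```

Calling `filter_bedgraph_file("Control.bedGraph", "Control.filtered.bedGraph")` (hypothetical file names) on each sample would produce the cleaned files to read in. Check the threshold against your own per-chromosome counts rather than taking 4 as given.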