Question: How to handle BSmooth.tstat for single-replicate samples? Error: need at least two non-NA values to interpolate
2.8 years ago by unmeshj, United States
unmeshj wrote:

Hello, I am working with bsseq to find DMRs in single-replicate samples (Control and Expt) from WGBS. Creating the bsseq object and smoothing worked well, but I encounter errors in the BSmooth.tstat function.

BS.OBJ.tstat = BSmooth.tstat(BS.OBJ.keep, group1="Control", group2="Expt", estimate.var = "paired", local.correct = TRUE, verbose = TRUE)

Error: length(group1) + length(group2) >= 3 is not TRUE

Or sometimes the error is:

[BSmooth.tstat] preprocessing ... done in 2.5 sec
[BSmooth.tstat] computing stats within groups ... done in 0.4 sec
[BSmooth.tstat] computing stats across groups ... Error in approxfun(xx, yy) : 
  need at least two non-NA values to interpolate

I see that one of the errors has been mentioned in another post here, but that post does not explain how to resolve the problem.

Am I using the command correctly? I am having a hard time understanding whether bsseq has options to handle single-replicate data. If it does, which steps or options need to change? The documentation does not mention any special options for single-replicate data, and we often prefer to pool the data from replicates into a single file per sample.

Thank you very much in advance.

Best regards,

UJ

 

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bsseq_0.10.0         matrixStats_0.14.0   GenomicRanges_1.14.4 XVector_0.2.0        IRanges_1.20.7       BiocGenerics_0.8.0  

loaded via a namespace (and not attached):
 [1] Biobase_2.22.0   colorspace_1.2-4 grid_3.0.2       lattice_0.20-23  locfit_1.5-9.1   munsell_0.4.2    plyr_1.8.1       Rcpp_0.11.4      scales_0.2.4    
[10] stats4_3.0.2     tools_3.0.2  

2.8 years ago by Kasper Daniel Hansen, United States
Kasper Daniel Hansen wrote:
What exactly do you mean by "single-replicate data"? If you're comparing two groups and there is only a single sample in each group, you're out of luck: you cannot use the t-stat approach in BSmooth (although you can still use the smoothing functionality). Instead you could use Fisher's exact test, which we also have in bsseq, but which does not handle biological variation.

The first error you see has to do with checking that you have enough samples in the two groups. The second error, which has been reported many times, typically happens when you include a very small chromosome (small = few CpGs), like chrMT. I suggest removing those first.

Kasper

On Tue, Mar 3, 2015 at 9:12 AM, unmeshj [bioc] <noreply@bioconductor.org> wrote:
> Question: How to handle BSmooth.tstat for single-replicate samples? Error: need at least two non-NA values to interpolate
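The three points in this answer can be illustrated with a minimal base-R sketch. It does not use bsseq itself: the toy counts, the column names, the helper `fisher_p`, and the cutoff of 100 CpGs per chromosome are all assumptions made for illustration. Within bsseq, the corresponding functions are, as far as I know, `chrSelectBSseq()` for keeping selected chromosomes and `fisherTests()` for the per-CpG test; check their help pages for the exact arguments in your version.

```r
# Toy counts (hypothetical): one row per CpG, one sample per group.
set.seed(1)
cpg <- data.frame(
  chr = c(rep("chr1", 200), rep("chrMT", 3)),
  pos = c(seq(1000, by = 50, length.out = 200), 100, 200, 300),
  stringsAsFactors = FALSE
)
n <- nrow(cpg)
cpg$cov_control <- rpois(n, 20) + 1L   # read coverage, at least 1
cpg$cov_expt    <- rpois(n, 20) + 1L
cpg$m_control   <- rbinom(n, size = cpg$cov_control, prob = 0.7)  # methylated reads
cpg$m_expt      <- rbinom(n, size = cpg$cov_expt, prob = 0.4)

# 1) The first error is a sample-count check: one sample per group
#    gives 1 + 1 = 2, which is not >= 3.
group1 <- "Control"
group2 <- "Expt"
enough_samples <- length(group1) + length(group2) >= 3

# 2) Drop chromosomes with very few CpGs (the cutoff is an arbitrary assumption).
n_cpg    <- table(cpg$chr)
keep_chr <- names(n_cpg)[n_cpg >= 100]
cpg_keep <- cpg[cpg$chr %in% keep_chr, ]

# 3) Per-CpG Fisher's exact test on methylated vs. unmethylated read counts.
fisher_p <- function(m1, cov1, m2, cov2) {
  tab <- matrix(c(m1, cov1 - m1, m2, cov2 - m2), nrow = 2)
  fisher.test(tab)$p.value
}
cpg_keep$p_value <- mapply(fisher_p,
                           cpg_keep$m_control, cpg_keep$cov_control,
                           cpg_keep$m_expt, cpg_keep$cov_expt)

# CpGs with the smallest (unadjusted) p-values
head(cpg_keep[order(cpg_keep$p_value), c("chr", "pos", "p_value")])
```

Because each group holds a single sample, these p-values reflect only read-sampling noise at each CpG and say nothing about sample-to-sample biological variation, which is the limitation noted in the answer.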
2.8 years ago by unmeshj, United States
unmeshj wrote:

Hello Kasper, 

Thank you very much for your quick reply. 

By single-replicate data, I mean that I have only one data file each for 'Control' and 'Expt'. So, as you mention in your reply, I cannot use the t-stat approach. I used the Fisher test and obtained an unadjusted p-value for each CpG. I have two specific questions:

1) When you say the Fisher test cannot model biological variability, do you mean we cannot use any functions within bsseq to identify DMRs?

2) Can you suggest any approach or program to identify DMRs in such single-replicate data (using the Fisher test results or otherwise)? Identifying DMRs, as opposed to individual CpGs with a methylation change, is a big strength of BSmooth/bsseq.

Also, I was able to use the BSmooth objects for plotting, which is really useful. I see good smoothing and clearly representative differences in the plotted BSmooth data at the known differential regions that I plotted.

Thanks again,

UJ
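Since the per-CpG p-values mentioned in this post are unadjusted, a common next step is a multiple-testing correction. The sketch below uses base R's `p.adjust()` with the Benjamini-Hochberg method; the p-values are made up for illustration, and this step alone does not turn individual CpGs into DMRs.

```r
# Hypothetical unadjusted per-CpG p-values (made up for illustration).
p_unadjusted <- c(0.0001, 0.004, 0.03, 0.2, 0.6)

# Benjamini-Hochberg adjustment controls the false discovery rate.
p_bh <- p.adjust(p_unadjusted, method = "BH")

data.frame(p_unadjusted, p_bh)
```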
