Question: Error using BSmooth.tstat due to NAs
Guest User wrote, 3.2 years ago:
Dear all,

I am trying to use bsseq to analyze WGBS data and identify DMRs following drug treatment. I have a BSseq object consisting of 2 samples (treated and ctrl) that has been smoothed:

> smooth
An object of type 'BSseq' with
  38250590 methylation loci
  2 samples
has been smoothed with
  BSmooth (ns = 50, h = 500, maxGap = 100000000)

When trying to run BSmooth.tstat, I am encountering the following error due to NAs:

> smooth=BSmooth.tstat(smooth, group1="Ctrl", group2="Treated", estimate.var="paired", verbose=TRUE, local.correct=TRUE)
preprocessing ... done in 76.1 sec
computing stats within groups ... done in 11.9 sec
computing stats across groups ... Error in approxfun(xx, yy) :
  need at least two non-NA values to interpolate
Timing stopped at: 7.994 1.649 9.64

However, when I checked in my methylation and coverage matrix, I didn't see any NAs contained in my data, so I am not sure why I am getting this error.

> summary(getMeth(smooth))
      Ctrl            Treated
 Min.   :0.0000   Min.   :0.0000
 1st Qu.:0.6064   1st Qu.:0.3391
 Median :0.8402   Median :0.4816
 Mean   :0.7131   Mean   :0.4365
 3rd Qu.:0.9006   3rd Qu.:0.5600
 Max.   :1.0000   Max.   :1.0000

I would appreciate any suggestions or advice.

Thank you very much,
Fides

-- output of sessionInfo():

R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats  graphics  grDevices  utils  datasets  methods
[8] base

other attached packages:
[1] bsseqData_0.1.3       bsseq_0.8.0     matrixStats_0.8.14
[4] GenomicRanges_1.12.5  IRanges_1.18.4  BiocGenerics_0.6.0
[7] plyr_1.8

loaded via a namespace (and not attached):
 [1] Biobase_2.20.1   R.methodsS3_1.6.1  RColorBrewer_1.0-5  colorspace_1.2-4
 [5] dichromat_2.0-0  grid_3.0.1         labeling_0.2        lattice_0.20-24
 [9] locfit_1.5-9.1   munsell_0.4.2      scales_0.2.3        stats4_3.0.1
[13] stringr_0.6.2    tools_3.0.1        zlibbioc_1.6.0

--
Sent via the guest posting facility at bioconductor.org.
Kasper Daniel Hansen (United States) wrote, 3.2 years ago:
This is hard to know for sure without knowing much more about the data.

My guess is that you have some contigs (chromosomes) which are super small and they cause problems. You can also try setting verbose to, say, 2 or 3 and see if that helps narrow down where in the function it happens.

On Wed, Sep 10, 2014 at 12:18 PM, Fides Lay [guest] <guest at bioconductor.org> wrote:
> computing stats across groups ... Error in approxfun(xx, yy) :
> need at least two non-NA values to interpolate
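The small-contig suggestion can be checked by counting loci per chromosome. The following is a minimal base R sketch, not part of bsseq: `small_contigs` and the `min_loci` threshold of 100 are hypothetical, and the toy vector `chr` stands in for the per-locus chromosome labels, which for a BSseq object `bs` could presumably be obtained with `as.character(seqnames(bs))`.

```r
# Hypothetical helper: count loci per chromosome/contig and flag sparse ones.
# `chr` holds one chromosome label per methylation locus.
# `min_loci` is an arbitrary illustrative threshold, not a bsseq default.
small_contigs <- function(chr, min_loci = 100) {
  counts <- table(chr)
  names(counts)[as.vector(counts) < min_loci]
}

# Toy data: two well-covered chromosomes and one tiny unplaced contig.
chr <- c(rep("chr1", 500), rep("chr2", 300), rep("chrUn_gl000220", 3))

drop <- small_contigs(chr, min_loci = 100)
keep <- !(chr %in% drop)

print(drop)
print(table(keep))

# With a real BSseq object the same logical vector could be used to subset
# loci before calling BSmooth.tstat (assumption: bs[keep, ] subsets by locus).
```

If contigs are flagged, dropping them and re-running the t-statistic step is one way to test whether they are the cause.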
Kasper Daniel Hansen replied, 3.2 years ago:

As Tim Triche pointed out off-list: what you're doing does not make sense when you only have 1 sample in each group. I was clearly reading the report too fast.

On Sun, Sep 14, 2014 at 1:07 PM, Kasper Daniel Hansen <khansen at jhsph.edu> wrote:
> This is hard to know for sure without knowing much more about the data.
>
> My guess is that you have some contigs (chromosomes) which are super small
> and they cause problems.
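Why one sample per group can produce this error even when the methylation matrix has no NAs can be illustrated in base R. This is only a sketch of a plausible mechanism, assuming the variance is estimated per locus from between-sample differences. It is not a trace of what BSmooth.tstat does internally, and the methylation values are made up.

```r
# One sample per group: smoothed methylation at five loci (made-up numbers).
ctrl    <- c(0.84, 0.90, 0.61, 0.71, 0.88)
treated <- c(0.48, 0.56, 0.34, 0.44, 0.52)

# A paired design with a single pair yields one difference per locus.
diffs <- cbind(ctrl - treated)

# The standard deviation of a single value is undefined, so every locus
# gets NA even though the input data contain no NAs.
sds <- apply(diffs, 1, sd)
print(sds)

# Interpolating over values that are all NA fails in approxfun().
msg <- tryCatch(
  {
    approxfun(seq_along(sds), sds)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

A t-statistic needs a variance estimate, which in turn needs replicates in at least one group.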
parker (Switzerland) wrote, 2.7 years ago:

I am having the same problem, but when I summarize the smoothed data:

summary(getMeth(bsseq.data.smoothed))

I see that there are quite a few NAs (~700) for most of my samples. I don't know whether this is because I used a targeted approach and many of the CpGs were not targeted. Is there some way I can remove the CpGs that were not covered by the targeted approach?

Many thanks in advance for your help!

Hannah
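One common way to drop uncovered CpGs is to filter loci on the coverage matrix. The sketch below uses base R on a toy matrix. `covered_loci` and its thresholds are hypothetical, and the commented lines assume that `getCoverage(bs)` returns a loci-by-samples matrix and that `bs[keep, ]` subsets a BSseq object by locus.

```r
# Hypothetical helper: keep loci covered by at least `min_cov` reads in at
# least `min_samples` samples. `cov` is a loci-by-samples coverage matrix.
covered_loci <- function(cov, min_cov = 1, min_samples = ncol(cov)) {
  rowSums(cov >= min_cov) >= min_samples
}

# Toy coverage: four loci (rows) in two samples (columns).
cov <- matrix(c(12, 0, 7, 3,
                 9, 0, 0, 5),
              ncol = 2, dimnames = list(NULL, c("s1", "s2")))

keep <- covered_loci(cov)
print(keep)

# With a real BSseq object `bs` (assumptions, see above):
#   cov  <- getCoverage(bs)
#   keep <- covered_loci(cov, min_cov = 2)
#   bs   <- bs[keep, ]
#   colSums(is.na(getMeth(bs)))   # remaining NAs per sample
```

Requiring coverage in every sample is the strictest choice. Lowering `min_samples` keeps more loci but may leave some NAs.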
