Simulation of the beta average
1
0
Entering edit mode
Samor Gandhi ▴ 40
@samor-gandhi-4278
Last seen 8.1 years ago
Hello, I would like to run a simulation study on the beta average of the Beta Studio illumin. I did something like x <- c(runif(200,0,0.1), runif(600,0.11,0.89),runif(200,0.9,1)); hist(x)! Would this be a reasonable simulation? Many thanks for any help, Samor [[alternative HTML version deleted]]
• 667 views
0
Entering edit mode
@sean-davis-490
Last seen 5 weeks ago
United States
On Tue, Sep 28, 2010 at 4:45 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: > Hello, > > I would like to run a simulation study on the beta average of the Beta > Studio illumin. I did something like x <- c(runif(200,0,0.1), > runif(600,0.11,0.89),runif(200,0.9,1)); hist(x)! Would this be a reasonable > simulation? > > Hi, Samor. I think the answer to your question is going to depend on what you mean by "reasonable" and that will depend on the question you are actually trying to answer. What is the question you are trying to answer? Are you trying to perform hypothesis testing or algorithm development, or something else? Sean [[alternative HTML version deleted]]
0
Entering edit mode
Hi Sean, I am trying to compute the power of the SAMr for t-test and Wilcoxon test! Regards, Samor --- On Tue, 28/9/10, Sean Davis <sdavis2@mail.nih.gov> wrote: From: Sean Davis <sdavis2@mail.nih.gov> Subject: Re: [BioC] Simulation of the beta average To: "Samor Gandhi" <samorgandhi@yahoo.com> Cc: bioconductor@stat.math.ethz.ch Date: Tuesday, 28 September, 2010, 14:31 On Tue, Sep 28, 2010 at 4:45 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: Hello, I would like to run a simulation study on the beta average of the Beta Studio illumin. I did something like x <- c(runif(200,0,0.1), runif(600,0.11,0.89),runif(200,0.9,1)); hist(x)! Would this be a reasonable simulation? Hi, Samor. I think the answer to your question is going to depend on what you mean by "reasonable" and that will depend on the question you are actually trying to answer.  What is the question you are trying to answer?  Are you trying to perform hypothesis testing or algorithm development, or something else? Sean [[alternative HTML version deleted]]
0
Entering edit mode
On Tue, Sep 28, 2010 at 7:15 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: > Hi Sean, > > I am trying to compute the power of the SAMr for t-test and Wilcoxon test! > > Hi, Samor. You might need to look into the data a bit more and perhaps review a bit about power calculation before proceeding further. There are a few steps missing, I think. What you posted doesn't constitute a simulation that could be used for power calculations unless I misunderstand what you are trying to do. Sean > Regards, > Samor > > --- On *Tue, 28/9/10, Sean Davis <sdavis2@mail.nih.gov>* wrote: > > > From: Sean Davis <sdavis2@mail.nih.gov> > Subject: Re: [BioC] Simulation of the beta average > To: "Samor Gandhi" <samorgandhi@yahoo.com> > Cc: bioconductor@stat.math.ethz.ch > Date: Tuesday, 28 September, 2010, 14:31 > > > > > On Tue, Sep 28, 2010 at 4:45 AM, Samor Gandhi <samorgandhi@yahoo.com<http: mc="" compose?to="samorgandhi@yahoo.com"> > > wrote: > > Hello, > > I would like to run a simulation study on the beta average of the Beta > Studio illumin. I did something like x <- c(runif(200,0,0.1), > runif(600,0.11,0.89),runif(200,0.9,1)); hist(x)! Would this be a reasonable > simulation? > > > Hi, Samor. > > I think the answer to your question is going to depend on what you mean by > "reasonable" and that will depend on the question you are actually trying to > answer. What is the question you are trying to answer? Are you trying to > perform hypothesis testing or algorithm development, or something else? > > Sean > > > > [[alternative HTML version deleted]]
0
Entering edit mode
Hi Sean, Thank you for your reply. What I posted would be for one sample, but I need to have let say 20 samples into two groups, that the first group would have different mean and standard deviation than the second group; and that was my problem how to generate these. I did something like S <- 10000 n <- 10 data1 <- c(runif(n*S*0.2,0,0.05), runif(n*S*0.6,0.06,0.95),runif(n*S*0.2,0.96,1)) data2 <- c(runif(n*S*0.2,0.06,0.1), runif(n*S*0.6,0.11,0.8),runif(n*S*0.2,0.81,0.9)) group1 <- matrix(data1, nrow = S,ncol = n) group2 <- matrix(data2, nrow = S,ncol = n) wt <- rep(NA, S) for (i in 1:S) { wt[i] <- wilcox.test(group1[i, ], group2[i, ], alternative = "two.sided")$p.value } Wsize <- sum(wt < 0.05)/S Wsize I very thankful for any help. Samor --- On Tue, 28/9/10, Sean Davis <sdavis2@mail.nih.gov> wrote: From: Sean Davis <sdavis2@mail.nih.gov> Subject: Re: [BioC] Simulation of the beta average To: "Samor Gandhi" <samorgandhi@yahoo.com> Cc: "bioconductor@stat.math.ethz.ch" <bioconductor@stat.math.ethz.ch> Date: Tuesday, 28 September, 2010, 17:46 On Tue, Sep 28, 2010 at 7:15 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: Hi Sean, I am trying to compute the power of the SAMr for t-test and Wilcoxon test! Hi, Samor. You might need to look into the data a bit more and perhaps review a bit about power calculation before proceeding further. There are a few steps missing, I think. What you posted doesn't constitute a simulation that could be used for power calculations unless I misunderstand what you are trying to do. Sean Regards, Samor --- On Tue, 28/9/10, Sean Davis <sdavis2@mail.nih.gov> wrote: From: Sean Davis <sdavis2@mail.nih.gov> Subject: Re: [BioC] Simulation of the beta average To: "Samor Gandhi" <samorgandhi@yahoo.com> Cc: bioconductor@stat.math.ethz.ch Date: Tuesday, 28 September, 2010, 14:31 On Tue, Sep 28, 2010 at 4:45 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: Hello, I would like to run a simulation study on the beta average of the Beta Studio illumin. I did something like x <- c(runif(200,0,0.1), runif(600,0.11,0.89),runif(200,0.9,1)); hist(x)! Would this be a reasonable simulation? Hi, Samor. I think the answer to your question is going to depend on what you mean by "reasonable" and that will depend on the question you are actually trying to answer. What is the question you are trying to answer? Are you trying to perform hypothesis testing or algorithm development, or something else? Sean [[alternative HTML version deleted]] ADD REPLY 0 Entering edit mode Hi Samor, Try estimating some parameters from whatever datasets you wish to test and then simulating thusly. For example, using the method of moments. Assume you have n1 cases and n0 controls with measurements at p loci, here's some code that I pulled out of my... well, here's some code. # this is some seriously fugly code, I should just paste mine # but that's even fuglier due to the simulation constraints params <- data.frame( mu1 <- rowMeans(your.case.betas, na.rm=TRUE), mu0 <- rowMeans(your.control.betas, na.rm=TRUE), s2.1 <- rowVars(your.case.betas, na.rm=TRUE), s2.0 <- rowVars(your.control.betas, na.rm=TRUE) ) params$phi1 <- ((params$mu1 - (params$mu1**2))/params$s2.1)-1, params$phi0 <- ((params$mu0 - (params$mu0**2))/params$s2.0)-1 with(params, { params$a1 <- mu1*phi1 params$b1 <- (1-mu1)*phi1 params$a0 <- mu0*phi0 params$b0 <- (1-mu0)*phi0 }) # I actually use a function betaff() for all of this betas <- matrix(0, ncol=n1+n0, nrow=p) for(p in 1:p) betas[,p] <- with(params[,p],c(rbeta(n1, a1, b1),rbeta(n0, a0, b0))) # yeah yeah, I should probably have used sapply() I'll spoil the surprise, though -- the Mann-Whitney test is the most consistent for unbalanced n1 and n0, and a model-based likelihood- based test beats the hell out of both of them at reasonable sizes. In between and striking a decent balance of power and FDR (with permutation) is the T test. For whatever reason, the Welch test basically always sucks until n > huge, at which point likelihood-based tests are already much more powerful. I can get a Wald test to behave down to about 10 vs. 10, for example, which blew me away. Anyhow, I'll publish this soon enough. Until then, feel free to use permGPU to control your FDR in small samples :-) Also look into the 'betareg' package if you wish, although there are kicks and stings there, too. Again, I'll publish some fixes soonishly. Too many people have helped me out not to do so. Sean -- I have a bunch of patches for methylumi to do background correction, some normalization (not yet as nice as I would like, but at least it works), bead-level data import, and nicer QC plots -- let me know if you want to try and merge them for the next release. Thanks for the great package. --tim On Tue, Sep 28, 2010 at 6:23 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: > Hi Sean, > > Thank you for your reply. What I posted would be for one sample, but I need > to have let say 20 samples into two groups, that the first group would have > different mean and standard deviation than the second group; and that was my > problem how to generate these. > > I did something like > > S <- 10000 > n <- 10 > data1 <- c(runif(n*S*0.2,0,0.05), > runif(n*S*0.6,0.06,0.95),runif(n*S*0.2,0.96,1)) > data2 <- c(runif(n*S*0.2,0.06,0.1), > runif(n*S*0.6,0.11,0.8),runif(n*S*0.2,0.81,0.9)) > > group1 <- matrix(data1, nrow = S,ncol = n) > group2 <- matrix(data2, nrow = S,ncol = n) > > wt <- rep(NA, S) > for (i in 1:S) { > wt[i] <- wilcox.test(group1[i, ], group2[i, ], alternative = > "two.sided")$p.value } > Wsize <- sum(wt < 0.05)/S > Wsize > > I very thankful for any help. > > Samor > > --- On Tue, 28/9/10, Sean Davis <sdavis2@mail.nih.gov> wrote: > > From: Sean Davis <sdavis2@mail.nih.gov> > Subject: Re: [BioC] Simulation of the beta average > To: "Samor Gandhi" <samorgandhi@yahoo.com> > Cc: "bioconductor@stat.math.ethz.ch" <bioconductor@stat.math.ethz.ch> > Date: Tuesday, 28 September, 2010, 17:46 > > > > On Tue, Sep 28, 2010 at 7:15 AM, Samor Gandhi <samorgandhi@yahoo.com> > wrote: > > Hi Sean, > > I am trying to compute the power of the SAMr for t-test and Wilcoxon test! > > > > Hi, Samor. > You might need to look into the data a bit more and perhaps review a bit > about power calculation before proceeding further. There are a few steps > missing, I think. What you posted doesn't constitute a simulation that > could be used for power calculations unless I misunderstand what you are > trying to do. > > Sean > > Regards, > Samor > > --- On Tue, 28/9/10, Sean Davis <sdavis2@mail.nih.gov> wrote: > > > From: Sean Davis <sdavis2@mail.nih.gov> > Subject: Re: [BioC] Simulation of the beta average > To: "Samor Gandhi" <samorgandhi@yahoo.com> > > Cc: bioconductor@stat.math.ethz.ch > Date: Tuesday, 28 September, 2010, 14:31 > > > > On Tue, Sep 28, 2010 at 4:45 AM, Samor Gandhi <samorgandhi@yahoo.com> > wrote: > > > Hello, > > > > I would like to run a simulation study on the beta average of the Beta > Studio illumin. I did something like x <- c(runif(200,0,0.1), > runif(600,0.11,0.89),runif(200,0.9,1)); hist(x)! Would this be a reasonable > simulation? > > > > > Hi, Samor. > I think the answer to your question is going to depend on what you mean by > "reasonable" and that will depend on the question you are actually trying to > answer. What is the question you are trying to answer? Are you trying to > perform hypothesis testing or algorithm development, or something else? > > > Sean > > > > > > > > > > > > [[alternative HTML version deleted]] > > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > -- With four parameters I can fit an elephant, and with five I can make him wiggle his trunk. John von Neumann [[alternative HTML version deleted]]
0
Entering edit mode
Hi Tim, Many thanks! Regards, Samor --- On Tue, 28/9/10, Tim Triche <tim.triche@gmail.com> wrote: From: Tim Triche <tim.triche@gmail.com> Subject: Re: [BioC] Simulation of the beta average To: "Samor Gandhi" <samorgandhi@yahoo.com> Cc: "Sean Davis" <sdavis2@mail.nih.gov>, "bioconductor@stat.math.ethz.ch" <bioconductor@stat.math.ethz.ch> Date: Tuesday, 28 September, 2010, 22:42 Hi Samor, Try estimating some parameters from whatever datasets you wish to test and then simulating thusly.  For example, using the method of moments. Assume you have n1 cases and n0 controls with measurements at p loci, here's some code that I pulled out of my... well, here's some code. # this is some seriously fugly code, I should just paste mine# but that's even fuglier due to the simulation constraints params <- data.frame(  mu1 <- rowMeans(your.case.betas, na.rm=TRUE),   mu0 <- rowMeans(your.control.betas, na.rm=TRUE),   s2.1 <- rowVars(your.case.betas, na.rm=TRUE),  s2.0 <- rowVars(your.control.betas, na.rm=TRUE) )params$phi1 <- ((params$mu1 - (params$mu1**2))/params$s2.1)-1, params$phi0 <- ((params$mu0 - (params$mu0**2))/params$s2.0)-1 with(params, {  params$a1 <- mu1*phi1 params$b1 <- (1-mu1)*phi1  params$a0 <- mu0*phi0 params$b0 <- (1-mu0)*phi0}) # I actually use a function betaff() for all of this betas <- matrix(0, ncol=n1+n0, nrow=p)for(p in 1:p)   betas[,p] <- with(params[,p],c(rbeta(n1, a1, b1),rbeta(n0, a0, b0))) # yeah yeah, I should probably have used sapply() I'll spoil the surprise, though -- the Mann-Whitney test is the most consistent for unbalanced n1 and n0, and a model-based likelihood- based test beats the hell out of both of them at reasonable sizes. In between and striking a decent balance of power and FDR (with permutation) is the T test.  For whatever reason, the Welch test basically always sucks until n > huge, at which point likelihood-based tests are already much more powerful.  I can get a Wald test to behave down to about 10 vs. 10, for example, which blew me away.  Anyhow, I'll publish this soon enough.  Until then, feel free to use permGPU to control your FDR in small samples :-) Also look into the 'betareg' package if you wish, although there are kicks and stings there, too.  Again, I'll publish some fixes soonishly.  Too many people have helped me out not to do so. Sean -- I have a bunch of patches for methylumi to do background correction, some normalization (not yet as nice as I would like, but at least it works), bead-level data import, and nicer QC plots -- let me know if you want to try and merge them for the next release. Thanks for the great package. --tim On Tue, Sep 28, 2010 at 6:23 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: Hi Sean, Thank you for your reply. What I posted would be for one sample, but I need to have let say 20 samples into two groups, that the first group would have different mean and standard deviation than the second group; and that was my problem how to generate these. I did something like S <- 10000  n <- 10  data1 <- c(runif(n*S*0.2,0,0.05), runif(n*S*0.6,0.06,0.95),runif(n*S*0.2,0.96,1))  data2 <- c(runif(n*S*0.2,0.06,0.1), runif(n*S*0.6,0.11,0.8),runif(n*S*0.2,0.81,0.9))  group1 <- matrix(data1, nrow = S,ncol = n)  group2 <- matrix(data2, nrow = S,ncol = n)  wt <- rep(NA, S)  for (i in 1:S) {  wt[i] <- wilcox.test(group1[i, ], group2[i, ], alternative = "two.sided")$p.value } Wsize <- sum(wt < 0.05)/S Wsize I very thankful for any help. Samor --- On Tue, 28/9/10, Sean Davis <sdavis2@mail.nih.gov> wrote: From: Sean Davis <sdavis2@mail.nih.gov> Subject: Re: [BioC] Simulation of the beta average To: "Samor Gandhi" <samorgandhi@yahoo.com> Cc: "bioconductor@stat.math.ethz.ch" <bioconductor@stat.math.ethz.ch> Date: Tuesday, 28 September, 2010, 17:46 On Tue, Sep 28, 2010 at 7:15 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: Hi Sean, I am trying to compute the power of the SAMr for t-test and Wilcoxon test! Hi, Samor. You might need to look into the data a bit more and perhaps review a bit about power calculation before proceeding further. There are a few steps missing, I think. What you posted doesn't constitute a simulation that could be used for power calculations unless I misunderstand what you are trying to do. Sean Regards, Samor --- On Tue, 28/9/10, Sean Davis <sdavis2@mail.nih.gov> wrote: From: Sean Davis <sdavis2@mail.nih.gov> Subject: Re: [BioC] Simulation of the beta average To: "Samor Gandhi" <samorgandhi@yahoo.com> Cc: bioconductor@stat.math.ethz.ch Date: Tuesday, 28 September, 2010, 14:31 On Tue, Sep 28, 2010 at 4:45 AM, Samor Gandhi <samorgandhi@yahoo.com> wrote: Hello, I would like to run a simulation study on the beta average of the Beta Studio illumin. I did something like x <- c(runif(200,0,0.1), runif(600,0.11,0.89),runif(200,0.9,1)); hist(x)! Would this be a reasonable simulation? Hi, Samor. I think the answer to your question is going to depend on what you mean by "reasonable" and that will depend on the question you are actually trying to answer. What is the question you are trying to answer? Are you trying to perform hypothesis testing or algorithm development, or something else? Sean [[alternative HTML version deleted]] _______________________________________________ Bioconductor mailing list Bioconductor@stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- With four parameters I can fit an elephant, and with five I can make him wiggle his trunk. John von Neumann [[alternative HTML version deleted]] ADD REPLY 0 Entering edit mode Why can you generate simulation data by bootstrapping your existing data? ...Tao ----- Original Message ---- > From: Samor Gandhi <samorgandhi at="" yahoo.com=""> > To: Sean Davis <sdavis2 at="" mail.nih.gov=""> > Cc: "bioconductor at stat.math.ethz.ch" <bioconductor at="" stat.math.ethz.ch=""> > Sent: Tue, September 28, 2010 6:23:56 AM > Subject: Re: [BioC] Simulation of the beta average > > Hi Sean, > > Thank you for your reply. What I posted would be for one sample, but I need to >have let say 20 samples into two groups, that the first group would have >different mean and standard deviation than the second group; and that was my >problem how to generate these. > > I did something like > > S <- 10000 > n <- 10 > data1 <- c(runif(n*S*0.2,0,0.05), >runif(n*S*0.6,0.06,0.95),runif(n*S*0.2,0.96,1)) > data2 <- c(runif(n*S*0.2,0.06,0.1), >runif(n*S*0.6,0.11,0.8),runif(n*S*0.2,0.81,0.9)) > > group1 <- matrix(data1, nrow = S,ncol = n) > group2 <- matrix(data2, nrow = S,ncol = n) > > wt <- rep(NA, S) > for (i in 1:S) { > wt[i] <- wilcox.test(group1[i, ], group2[i, ], alternative = >"two.sided")$p.value } > Wsize <- sum(wt < 0.05)/S > Wsize > > I very thankful for any help. > > Samor > > --- On Tue, 28/9/10, Sean Davis <sdavis2 at="" mail.nih.gov=""> wrote: > > From: Sean Davis <sdavis2 at="" mail.nih.gov=""> > Subject: Re: [BioC] Simulation of the beta average > To: "Samor Gandhi" <samorgandhi at="" yahoo.com=""> > Cc: "bioconductor at stat.math.ethz.ch" <bioconductor at="" stat.math.ethz.ch=""> > Date: Tuesday, 28 September, 2010, 17:46 > > > > On Tue, Sep 28, 2010 at 7:15 AM, Samor Gandhi <samorgandhi at="" yahoo.com=""> wrote: > > Hi Sean, > > I am trying to compute the power of the SAMr for t-test and Wilcoxon test! > > > > Hi, Samor. > You might need to look into the data a bit more and perhaps review a bit about >power calculation before proceeding further. There are a few steps missing, I >think. What you posted doesn't constitute a simulation that could be used for >power calculations unless I misunderstand what you are trying to do. > > Sean > > Regards, > Samor > > --- On Tue, 28/9/10, Sean Davis <sdavis2 at="" mail.nih.gov=""> wrote: > > > From: Sean Davis <sdavis2 at="" mail.nih.gov=""> > Subject: Re: [BioC] Simulation of the beta average > To: "Samor Gandhi" <samorgandhi at="" yahoo.com=""> > > Cc: bioconductor at stat.math.ethz.ch > Date: Tuesday, 28 September, 2010, 14:31 > > > > On Tue, Sep 28, 2010 at 4:45 AM, Samor Gandhi <samorgandhi at="" yahoo.com=""> wrote: > > > Hello, > > > > I would like to run a simulation study on the beta average of the Beta Studio >illumin. I did something like x <- c(runif(200,0,0.1), >runif(600,0.11,0.89),runif(200,0.9,1)); hist(x)! Would this be a reasonable >simulation? > > > > > Hi, Samor. > I think the answer to your question is going to depend on what you mean by >"reasonable" and that will depend on the question you are actually trying to >answer. What is the question you are trying to answer? Are you trying to >perform hypothesis testing or algorithm development, or something else? > > > Sean > > > > > > > > > > > > [[alternative HTML version deleted]] > >