Re: technical replicates (again!): a summary
@gordon-smyth
WEHI, Melbourne, Australia
Microarray GO Cancer vsn
@johan-lindberg-581
Ramon Diaz ★ 1.1k
@ramon-diaz-159
Sorry, Gordon, you are right. My fault.

In fact, wouldn't that be a good way to go, and avoid convergence problems
with REML, especially if we don't care much about the random effect and
within-subject replication is small (i.e., the number of tech. reps. is
small)?

R.

On Thursday 01 April 2004 00:18, Gordon Smyth wrote:
> Hi Ramon,
>
> You've left out an important strategy, which I've suggested a couple of
> times recently, which is to fit the technical replicates as a fixed factor
> rather than a random factor.
>
> Cheers
> Gordon
>
> At 01:49 AM 1/04/2004, you wrote:
> >Dear Gordon, Naomi, and BioC list,
> >
> >The issue of how to deal with technical replicates (such as those
> >obtained when we do dye-swaps of the same biological samples in cDNA
> >arrays) has come up in the BioC list several times. What follows is a
> >short summary, with links to mails in BioC plus some questions/comments.
> >
> >There seem to be three major ways of approaching the issue:
> >
> >
> >1. Treat the technical replicates as ordinary replicates
> >********************************************************
> >E.g., Gordon Smyth in Sept. 2003
> >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-September/002405.html)
> >
> >However, this makes me (and Naomi Altman; see, e.g.,
> >https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-December/003340.html)
> >uneasy: tech. reps. are not independent biological reps., which leads to
> >the usual inflation of dfs and deflation of s.e.
> >
> >I guess part of the key to Gordon's suggestion is his comment that even
> >if the s.e. are slightly underestimated, the ordering is close to being
> >the optimal one. But I don't see why the ordering ought to be much worse
> >if we use the means of technical replicates as in 3. below. (I haven't
> >done the math, but it seems that, especially in the presence of strong
> >tech. reps. covariance and a small number of independent samples, we
> >ought to be better off using the means of the tech. reps.)
> >
> >
> >2. Mixed effects models with subject as random effect (e.g., via lme)
> >*********************************************************************
> >In late August of 2003 I asked about these issues, and Gordon seemed to
> >agree that trying the lme approach could be a way to go
> >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-August/002224.html).
> >
> >However, in September, I posted an answer and included code that, at
> >least for some cases, shows potential problems with using lme when the
> >number of technical replicates is small
> >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-September/002430.html).
> >
> >Nevertheless, Naomi Altman reports using lme/mixed models in a couple of
> >emails
> >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-December/003191.html;
> >https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-January/003481.html).
> >
> >After reading about randomizedBlock (package statmod) in a message in
> >BioC (I think from Gordon), I have tried the mixed models approach again,
> >since with tech. reps. and no other random effects, we should be able to
> >use randomizedBlock. Details in 5. below.
> >
> >
> >3. Take the average of the technical replicates
> >***********************************************
> >Except for being possibly conservative (and not estimating the tech.
> >reps. variance component), I think this is a "safe" procedure.
> >This is what I have ended up doing routinely after my disappointing
> >tries with lme
> >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-September/002430.html)
> >and what Bill Kenworthy seemed to end up doing
> >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-January/003493.html).
> >
> >I think this is also what is done at least sometimes in the literature
> >(e.g., Huber et al., 2002, Bioinformatics, 18, S96--S104 [the vsn paper]).
> >
> >*********
> >
> >4. Dealing with replicates in future versions of limma
> >******************************************************
> >Now, in Sept. 2003 Gordon mentioned that an explicit treatment of tech.
> >reps. would be in a future version of limma
> >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-September/002411.html)
> >and I wonder if Gordon meant via mixed-effects models, or some other
> >way, or if there has been some progress in this area.
> >
> >
> >5. Using randomizedBlock
> >************************
> >In a simple setup of control and treatment with dye-swaps, I have done
> >some numerical comparisons of the outcome of a t-test on the mean of the
> >technical replicates with lme and with randomizedBlock. [The function is
> >attached.] The outcome of the t-test of the means of replicates and of
> >randomizedBlock, in terms of the t-statistic, tends to be the same (if
> >we "positivize" the dye swaps). The only difference, then, lies in the
> >df we then use to put a p-value on the statistic. But I don't see how we
> >can use the dfs from randomizedBlock: they seem way too large. Where am
> >I getting lost?
> >
> >
> >Best,
> >
> >R.

--
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://bioinfo.cnio.es/~rdiaz
PGP KeyID: 0xE89B3462
(http://bioinfo.cnio.es/~rdiaz/0xE89B3462.asc)
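To make the contrast between approaches 1 and 3 in the summary concrete, here is a minimal Python sketch. It is not the function attached to the original message: the data, the layout (3 biological subjects per group, 2 technical replicates each, no dye-swap sign handling), and all names are made up for illustration. It only shows how counting technical replicates as independent observations raises the df and shrinks the s.e. relative to a t-test on subject means.

```python
import math
from statistics import mean, variance

# Hypothetical log-ratios: one inner list per biological subject,
# holding that subject's two technical replicates.
control = [[1.0, 1.2], [0.2, 0.0], [0.6, 0.8]]
treatment = [[1.6, 1.4], [2.4, 2.6], [1.9, 2.1]]


def pooled_t(x, y):
    """Pooled-variance two-sample t-test; returns (t, df, se)."""
    df = len(x) + len(y) - 2
    pooled_var = ((len(x) - 1) * variance(x) + (len(y) - 1) * variance(y)) / df
    se = math.sqrt(pooled_var * (1 / len(x) + 1 / len(y)))
    return (mean(y) - mean(x)) / se, df, se


def flatten(group):
    """All technical replicates of a group as one flat list."""
    return [value for subject in group for value in subject]


def subject_means(group):
    """One averaged value per biological subject."""
    return [mean(subject) for subject in group]


# Approach 1: technical replicates treated as ordinary replicates.
t_naive, df_naive, se_naive = pooled_t(flatten(control), flatten(treatment))

# Approach 3: average the technical replicates first.
t_means, df_means, se_means = pooled_t(subject_means(control),
                                       subject_means(treatment))

print(f"ordinary replicates: t={t_naive:.3f} df={df_naive} se={se_naive:.3f}")
print(f"subject means:       t={t_means:.3f} df={df_means} se={se_means:.3f}")
```

With these invented numbers the technical replicates of a subject agree closely, so the first analysis reports more df and a smaller s.e. for the same difference in means. How large the gap is in real data depends on the within-subject correlation.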
If you treat technical replicates as fixed effects in an RCB with
within-block replicates, you get the wrong error term for testing the fixed
effects. So, I think this only works if you then designate rep*treatment as
the error.

--Naomi

At 11:27 AM 4/1/2004 +0200, Ramon Diaz-Uriarte wrote:
>Sorry, Gordon, you are right. My fault.
>In fact, wouldn't that be a good way to go, and prevent problems from
>convergence with REML, specially if we don't care much about the random
>effect and within subject replication is small (i.e., number tech. reps.
>small)?
>
>R.
>
>On Thursday 01 April 2004 00:18, Gordon Smyth wrote:
> > Hi Ramon,
> >
> > You've left out an important strategy, which I've suggested a couples of
> > times recently, which is to fit the technical replicates a fixed factor
> > rather than a random factor.
> >
> > Cheers
> > Gordon
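One possible reading of the error-term concern can be sketched in Python. The assumptions here are not stated in the thread: a balanced layout with biological subjects nested within treatment and technical replicates nested within subject, which may differ from the RCB layout Naomi has in mind. The data and names are invented. The sketch splits the sums of squares and forms the treatment F-ratio against two candidate error terms.

```python
import math
from statistics import mean, variance

# Hypothetical data: groups -> subjects -> technical replicates.
groups = {
    "control": [[1.0, 1.2], [0.2, 0.0], [0.6, 0.8]],
    "treatment": [[1.6, 1.4], [2.4, 2.6], [1.9, 2.1]],
}


def flatten(group):
    return [value for subject in group for value in subject]


subjects = [s for g in groups.values() for s in g]
all_values = [v for s in subjects for v in s]
grand_mean = mean(all_values)

# Sums of squares for treatment, subjects within treatment, and
# technical replicates within subject.
ss_treat = sum(len(flatten(g)) * (mean(flatten(g)) - grand_mean) ** 2
               for g in groups.values())
ss_subj = sum(len(s) * (mean(s) - mean(flatten(g))) ** 2
              for g in groups.values() for s in g)
ss_within = sum((v - mean(s)) ** 2 for s in subjects for v in s)

df_treat = len(groups) - 1
df_subj = len(subjects) - len(groups)
df_within = len(all_values) - len(subjects)

ms_treat = ss_treat / df_treat
# Treatment tested against between-subject variation.
f_subj = ms_treat / (ss_subj / df_subj)
# Treatment tested against technical (within-subject) variation only.
f_within = ms_treat / (ss_within / df_within)

# For comparison: squared pooled t-statistic on the subject means.
mc = [mean(s) for s in groups["control"]]
mt = [mean(s) for s in groups["treatment"]]
pooled = ((len(mc) - 1) * variance(mc) + (len(mt) - 1) * variance(mt)) / df_subj
t_sq = (mean(mt) - mean(mc)) ** 2 / (pooled * (1 / len(mc) + 1 / len(mt)))

print(f"F vs subject error: {f_subj:.3f} on ({df_treat}, {df_subj}) df")
print(f"F vs within error:  {f_within:.3f} on ({df_treat}, {df_within}) df")
print(f"t^2 on subject means: {t_sq:.3f}")
```

In this balanced case the F-ratio against the between-subject term equals the squared t-statistic on the subject means, while the F-ratio against the within-subject residual is far larger because it measures only technical variation.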
Ramon Diaz ★ 1.1k
@ramon-diaz-159