F-test vs.T-test-on-differences


Claus Mayer ▴ 340

@claus-mayer-1179

Last seen 11.4 years ago

European Union

Hello Benjamin!

I think there is some misunderstanding here. The t-test is a test for the difference between the means of two distributions. If you center your data as you propose, the difference is 0, so the t-statistic will always behave very much as it does under the null hypothesis (not exactly, as the distributions might differ in variances and other properties, but the t-test is NOT meant to detect those).

The F-test, on the other hand, specifically tests for a difference in variances, so it is clearly the more appropriate test in your case (and if you are worried about non-normality, you might determine p-values by a resampling method such as the bootstrap).

I think what might have confused you is that there are TWO F-tests:

a) the one for testing differences between variances (let's call that F1)
b) the F-test that is used in Analysis of Variance (ANOVA) (let's call it F2)

Despite its name, ANOVA is a method to compare MEANS, not VARIANCES. With two groups you have the trivial case of a one-way ANOVA, and if you calculate the F-statistic F2 for that, it is just a transformation of the usual t-statistic (F2 = t^2 for the equal-variance t-test), i.e. the test will yield the same p-values.

So F1 and F2 are very different statistics for very different things, but both have an F-distribution under normality assumptions, so their names are the same (there are plenty of chi-square tests out there as well!)

Hope this helps

Claus

Benjamin Otto wrote:
> Dear community,
>
> That might be a stupid statistical question but I'm really not sure about
> the answer:
>
> Suppose I have two groups of numeric values x11-x19 and y11-y19. The
> conventional way to check for a difference in variance here is performing an
> F-test with, say,
>
> > g1 <- c(x11:x19)
> > g2 <- c(y11:y19)
> > var.test(g1, g2)
>
> and looking at the resulting p-value. A second possibility is calculating
> some adjusted values first, like
>
> > g1.adj <- abs(g1 - mean(g1))
> > g2.adj <- abs(g2 - mean(g2))
>
> and afterwards performing a t-test on those values. Should that give me the
> same result? I tried to solve it mathematically and the statistic doesn't
> seem to be the same. But then, why is the F-test calculated as it is, AND is
> it really better for a comparison than the second version?
>
> Regards,
>
> benjamin
>
> --
> Benjamin Otto
> Universitaetsklinikum Eppendorf Hamburg
> Institut fuer Klinische Chemie
> Martinistrasse 52
> 20246 Hamburg

--
Dr Claus-D. Mayer                    | http://www.bioss.ac.uk
Biomathematics & Statistics Scotland | email: claus at bioss.ac.uk
Rowett Research Institute            | Telephone: +44 (0) 1224 716652
Aberdeen AB21 9SB, Scotland, UK.     | Fax: +44 (0) 1224 715349
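
The distinction between the two F-tests can be checked numerically. The following is only a minimal sketch on simulated data (the actual x11-x19 and y11-y19 values are not given in the thread, so the `rnorm` samples and all variable names other than `g1` and `g2` are assumptions made for illustration). It computes F1 with `var.test`, F2 from a two-group one-way ANOVA, and the equal-variance t-statistic on the same data.

```r
# Simulated stand-ins for the two groups (assumption: real values are not in the thread)
set.seed(1)
g1 <- rnorm(9, mean = 0, sd = 1)
g2 <- rnorm(9, mean = 0, sd = 3)

# F1: ratio of sample variances, tests equality of VARIANCES
f1 <- var.test(g1, g2)

# F2: one-way ANOVA with two groups, tests equality of MEANS
values <- c(g1, g2)
group  <- factor(rep(c("g1", "g2"), each = 9))
f2 <- anova(lm(values ~ group))

# Equal-variance two-sample t-test on the same data
tt <- t.test(g1, g2, var.equal = TRUE)

f1_stat <- unname(f1$statistic)   # var(g1) / var(g2)
f2_stat <- f2[["F value"]][1]     # ANOVA F statistic for the group effect
t_stat  <- unname(tt$statistic)

# F2 is the squared t-statistic and gives the same p-value
all.equal(f2_stat, t_stat^2)
all.equal(f2[["Pr(>F)"]][1], tt$p.value)

# F1 is a different quantity altogether: the plain variance ratio
all.equal(f1_stat, var(g1) / var(g2))
```

With two groups, F2 carries no information beyond the t-test, while F1 responds to the spread of the groups and ignores their means.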


Written 19.3 years ago by Claus Mayer • updated 19.3 years ago by Naomi Altman


Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 4.8 years ago

United States

Actually, since Benjamin took abs(x - xbar), the means are not the same. abs(x - xbar) should be centered roughly on SD(x).

--Naomi

At 04:15 AM 11/1/2006, Claus Mayer wrote:
> Hello Benjamin!
>
> I think there is some misunderstanding here. [...]

Naomi S. Altman                 814-865-3791 (voice)
Associate Professor
Dept. of Statistics             814-863-7114 (fax)
Penn State University           814-865-1348 (Statistics)
University Park, PA 16802-2111
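
Naomi's point can be illustrated with a short simulation. This is a sketch under stated assumptions, not part of the original exchange: the data are simulated normal samples, and the names `x`, `y`, `ratio_x`, `ratio_y`, `dev_test` and `f_test` are made up for the example. For normally distributed data the mean absolute deviation is sd * sqrt(2 / pi), roughly 0.8 * sd, so groups with different variances have different means after the abs(x - xbar) transformation, and a t-test on the transformed values can pick that up. As far as I am aware, this is the idea behind Levene-type tests for equal variances.

```r
set.seed(2)

# Large simulated normal samples with equal means but different SDs
# (assumption: chosen only to make the expectation visible)
x <- rnorm(1e5, mean = 5, sd = 2)
y <- rnorm(1e5, mean = 5, sd = 4)

x.adj <- abs(x - mean(x))
y.adj <- abs(y - mean(y))

# For normal data, mean(|x - xbar|) / sd(x) is close to sqrt(2 / pi)
ratio_x  <- mean(x.adj) / sd(x)
ratio_y  <- mean(y.adj) / sd(y)
expected <- sqrt(2 / pi)
c(ratio_x = ratio_x, ratio_y = ratio_y, expected = expected)

# Small samples as in the question: t-test on absolute deviations
g1 <- rnorm(9, sd = 1)
g2 <- rnorm(9, sd = 3)
g1.adj <- abs(g1 - mean(g1))
g2.adj <- abs(g2 - mean(g2))

dev_test <- t.test(g1.adj, g2.adj, var.equal = TRUE)  # test on deviations
f_test   <- var.test(g1, g2)                          # classical F-test

# The two tests address the same question but are not the same statistic
c(deviation_t = dev_test$p.value, F = f_test$p.value)
```

The two p-values will generally differ, which matches Benjamin's observation that the statistics are not algebraically identical.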

Written 19.3 years ago by Naomi Altman


Which only shows that one should read these things properly before one replies. Very sorry about that!

I haven't come across that approach as a test for differences in variances yet, but I can see the idea now. As the F-test has optimality properties for normal distributions, I would still prefer it (possibly performed as a resampling test to make it more robust against deviations from normality).

Sorry again for misreading and misinterpreting the question.

Claus

Naomi Altman wrote:
> Actually, since Benjamin took abs(x - xbar), the means are not the same.
> abs(x - xbar) should be centered roughly on SD(x).
>
> --Naomi
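
One way the resampling idea could look in practice is a permutation version of the variance-ratio test. The sketch below is an assumption-laden illustration, not the procedure Claus had in mind (he does not specify one): `perm_var_test` is a hypothetical helper, the data are simulated, and the choices of statistic (absolute log variance ratio) and of centering each group on its own mean before permuting are mine. Because the centering uses estimated means, the permutation p-value is only approximate.

```r
# Hypothetical helper (not from the thread): permutation test for equal variances
perm_var_test <- function(g1, g2, n_perm = 2000) {
  # Center each group so a difference in means does not leak into the test
  c1 <- g1 - mean(g1)
  c2 <- g2 - mean(g2)
  pooled <- c(c1, c2)
  n1 <- length(c1)

  # Two-sided statistic: absolute log of the variance ratio
  stat <- function(a, b) abs(log(var(a) / var(b)))
  observed <- stat(c1, c2)

  # Reassign group labels at random and recompute the statistic
  perm_stats <- replicate(n_perm, {
    idx <- sample(length(pooled), n1)
    stat(pooled[idx], pooled[-idx])
  })

  # Add-one correction keeps the p-value strictly above zero
  (sum(perm_stats >= observed) + 1) / (n_perm + 1)
}

# Simulated groups (assumption: real data are not given in the thread)
set.seed(3)
g1 <- rnorm(9, sd = 1)
g2 <- rnorm(9, sd = 3)

p_perm <- perm_var_test(g1, g2)
p_f    <- var.test(g1, g2)$p.value
c(permutation = p_perm, F = p_f)
```

The reference distribution here comes from the data rather than from the F-distribution, which is what makes the approach less dependent on normality. With only nine values per group the permutation p-value is coarse, so it is best read alongside the classical result rather than instead of it.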

Written 19.3 years ago by Claus Mayer


Hi Naomi, Claus,

The distribution argument you both mentioned seems convincing. Maybe I really should stick to the normal F-test for my comparisons.

Regards,
Benjamin

-----Original Message-----
From: Claus Mayer [mailto:claus at bioss.ac.uk]
Sent: 01 November 2006 19:09
To: Naomi Altman; 'BioClist'; Benjamin Otto
Subject: Re: [BioC] F-test vs.T-test-on-differences

> Which only shows that one should read these things properly before one
> replies. [...]

Written 19.3 years ago by Benjamin Otto
