Question

F-statistic in limma


James W. MacDonald 65k


Edmund Chang wrote:
> Hi Jim,
> Thank you very much for the many tips! I really do appreciate it. I do
> have another question about p-values associated with F-statistics.
>
> results <- decideTests(fit.all, method="nestedF", adj="none")
>
> In the limma user guide, it says the following about the F-statistic (Sec 18):
>
> "... In a complex experiment with many contrasts, it may be desirable to
> select genes firstly on the basis of their moderated F-statistics, and
> subsequently to decide which of the individual contrasts are significant
> for the selected genes. This cuts down on the number of tests which
> need to be conducted and therefore on the amount of adjustment for
> multiple testing. The function decideTests() with method="nestedF" is
> able to conduct such tests."
>
> If I do
>
> results <- decideTests(fit.all, method="nestedF", adj="none")
>
> I take this to mean that I can select the genes with either +1 or -1 in
> results$Res.contrastcoefficient

I think if you do something like that, R will return NULL, because your results object isn't a list(), nor does it have anything in it called Res.contrastcoefficient. Maybe an earlier version of limma used a list for the TestResults object, but the version I have uses an S4 object with only one slot.

Anyway, I think what is meant by the quote from the limma user's guide is this: if you do decideTests() with method = "separate", "hierarchical" or "global", you are deciding from the outset to test a fixed set of contrasts, and you will have to adjust for all of the resulting comparisons. In contrast, "nestedF" looks at each individual significant F-statistic and then tries to decide which contrast(s) contributed to that significant result.

As an example, say geneX has a significant F-statistic, and it is due to only one contrast (from a total of, say, four). If you select "nestedF", it will find that first contrast and see that it is significant, but then won't make any more comparisons, because it will be able to tell that the other contrasts didn't contribute to the overall significance. Because "nestedF" stops when the remaining contrasts are not significant, you do fewer comparisons than with the other methods, and thus have to adjust less for multiplicity.

So, long story short, "nestedF" doesn't allow you to make any choices as to which genes to test; it simply does fewer tests (well, technically I think the upper bound is the same as for "hierarchical", but I don't think it is likely that this would ever happen).

HTH,

Jim

> and then do multiple-testing correction (using say q-value?) on the
> unadjusted p-values for my contrast of interest (for which I would then
> choose some cutoff). I am wondering if application of multiple-testing
> correction in this fashion would underestimate the true FDR (rather than
> running q-value on the entire set of genes regardless of how they
> contribute to the F-statistic?)
>
> Thank you for your time,
> Edmund

--
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
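For what it's worth, here is a minimal sketch of how one might pull out the genes flagged for a single contrast, treating the result of decideTests() as a matrix rather than a list. Everything in it is made up for illustration: the expression matrix `y` is random noise, and the group labels and the contrast names `BvsA` and `CvsA` are hypothetical. It assumes limma is installed, and it spells `adjust.method` out in full.

```r
library(limma)

# Hypothetical data: 200 "genes", 9 arrays, three groups of three (pure noise).
set.seed(1)
y <- matrix(rnorm(200 * 9), nrow = 200, ncol = 9)
group <- factor(rep(c("A", "B", "C"), each = 3))

# Group-means design and two illustrative contrasts against group A.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
cont <- makeContrasts(BvsA = B - A, CvsA = C - A, levels = design)

fit <- lmFit(y, design)
fit.all <- eBayes(contrasts.fit(fit, cont))

# One column per contrast, one row per gene, entries -1, 0 or +1.
results <- decideTests(fit.all, method = "nestedF", adjust.method = "none")

# Row indices of genes called up or down for the BvsA contrast.
selected <- which(results[, "BvsA"] != 0)
summary(results)
```

The point is only that the selection is done by indexing a column of `results` by contrast name, not by `$`.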

Microarray limma

updated 18.2 years ago by echang4@life.uiuc.edu • written 18.2 years ago by James W. MacDonald

score 0 · Answer 1 · 2006-03-03

Hi Jim,

Thank you very much for the many tips! I really do appreciate it. I do have another question about p-values associated with F-statistics.

results <- decideTests(fit.all, method="nestedF", adj="none")

In the limma user guide, it says the following about the F-statistic (Sec 18):

"... In a complex experiment with many contrasts, it may be desirable to select genes firstly on the basis of their moderated F-statistics, and subsequently to decide which of the individual contrasts are significant for the selected genes. This cuts down on the number of tests which need to be conducted and therefore on the amount of adjustment for multiple testing. The function decideTests() with method="nestedF" is able to conduct such tests."

If I do

results <- decideTests(fit.all, method="nestedF", adj="none")

I take this to mean that I can select the genes with either +1 or -1 in results$Res.contrastcoefficient and then do multiple-testing correction (using say q-value?) on the unadjusted p-values for my contrast of interest (for which I would then choose some cutoff). I am wondering if application of multiple-testing correction in this fashion would underestimate the true FDR (rather than running q-value on the entire set of genes regardless of how they contribute to the F-statistic?)

Thank you for your time,
Edmund

James W. MacDonald wrote:
> echang4 at life.uiuc.edu wrote:
>> Hi Bioconductor users,
>> I am having trouble understanding how multiple-testing adjustment is
>> done in limma (specifically in decideTests). I am really confused about
>> the difference between the moderated F p-value and the adjusted p-value.
>>
>> If I try two different "flavors" of decideTests (e.g. nestedF and
>> global), I can see that the results are different... but is it the
>> p-value that is adjusted, or the F-statistic?
>
> The statistics themselves are never adjusted in decideTests(), only
> the p-values.
>
> The difference is in how the p-values are adjusted. For the 'global'
> option, all the contrasts are considered to be independent, and the
> p-values are adjusted as if you just had a bunch of independent t-tests.
>
> The nestedF option is a bit more complicated. First, a bit of
> background. The F-statistic is used to determine if there are any
> differences between the samples, but it doesn't tell you which
> sample(s) are different. You have to fit contrasts to find out which
> sample(s) are different.
>
> So the idea with nestedF is to adjust the p-values associated with
> the F-test to find which genes are differentially expressed in at
> least one sample. Now we have a list of genes that are differentially
> expressed, but we don't know for which sample(s) that may be true. The
> t-statistics associated with the contrasts are then inspected, and the
> largest one (in absolute value) is considered significant. Now, there
> may be other contrasts that are significant as well, so the largest
> t-statistic is set to the same absolute value as the second largest
> t-statistic, and the F-statistic is calculated again. If the
> F-statistic is still significant, the second largest contrast is
> considered significant. This procedure is continued until the
> F-statistic is no longer significant.
>
> The basic reasoning here is that the largest t-statistic for a set of
> contrasts is significant if the overall F-statistic is significant. By
> following this step-wise procedure, we can determine which contrasts
> are contributing to the overall significance of the F-statistic.
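The step-wise procedure described above can be sketched in a few lines of base R. This is only an illustration of the idea for a single gene, not limma's own code: the function name `nested_f_sketch` and the example t-statistics are hypothetical, and it assumes the contrasts are uncorrelated, so that the F-statistic is simply the mean of the squared t-statistics (as far as I know, limma's implementation also allows for correlation between contrasts).

```r
# Step-down classification of one gene's t-statistics.
# Assumption: uncorrelated contrasts, so F = mean(t^2) on (r, df) degrees of freedom.
nested_f_sketch <- function(tstat, df, p.value = 0.05) {
  r <- length(tstat)
  crit <- qf(p.value, df1 = r, df2 = df, lower.tail = FALSE)
  f_stat <- function(x) sum(x^2) / r

  result <- setNames(integer(r), names(tstat))
  if (f_stat(tstat) <= crit) return(result)   # overall F not significant

  # The largest |t| is declared significant when the overall F is.
  ord <- order(abs(tstat), decreasing = TRUE)
  result[ord[1]] <- as.integer(sign(tstat[ord[1]]))

  x <- tstat
  for (j in seq_len(r)[-1]) {
    # Shrink every larger statistic down to the size of the j-th largest,
    # then ask whether the F-statistic is still significant.
    bigger <- ord[seq_len(j - 1)]
    x[bigger] <- sign(x[bigger]) * abs(x[ord[j]])
    if (f_stat(x) > crit) {
      result[ord[j]] <- as.integer(sign(x[ord[j]]))
    } else {
      break   # remaining contrasts do not contribute
    }
  }
  result
}

# geneX: four contrasts, only the first one large.
geneX <- c(c1 = 6, c2 = 0.5, c3 = -0.3, c4 = 0.2)
res <- nested_f_sketch(geneX, df = 20)
print(res)
```

For geneX the loop stops after the first contrast, which is the "fewer comparisons" behaviour: once the largest statistic is shrunk to 0.5, the recomputed F-statistic is far below the critical value, so the other three contrasts are never called significant.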