Correct use of Nested/Stage-wise testing with limma/stageR for 3-group design.
maltethodberg ▴ 140
@maltethodberg-9690
Denmark

I'm analysing an RNA-Seq dataset with the following study design: three locations (e.g. A, B, C) at two different times (e.g. time 1 and 2), everything in triplicate. I'm interested in finding DE genes within each time point: A1 vs B1, A1 vs C1, B1 vs C1, and similarly for the second time point.

I set limma up to do all the pairwise comparisons between location-time combinations (~0+Group and makeContrasts). I can then extract either t-tests for each pairwise comparison with topTable(coef="A1-B1"), or F-tests across all pairwise comparisons within a time point with topTable(coef=c("A1-B1", "A1-C1", "B1-C1")), etc.
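For reference, a minimal sketch of this setup on simulated data. The matrix `y`, the sample layout and the seed are placeholders rather than the real dataset; with RNA-Seq counts, `y` would typically come from `voom()` instead.

```r
library(limma)

set.seed(1)
# Hypothetical data: 200 genes x 18 samples of simulated log-expression values
group <- factor(rep(c("A1", "B1", "C1", "A2", "B2", "C2"), each = 3),
                levels = c("A1", "B1", "C1", "A2", "B2", "C2"))
y <- matrix(rnorm(200 * 18), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("s", 1:18)))

# Cell-means design: one coefficient per location-time group
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# All pairwise contrasts within each time point
cm <- makeContrasts(
  contrasts = c("A1-B1", "A1-C1", "B1-C1", "A2-B2", "A2-C2", "B2-C2"),
  levels = design
)

fit <- eBayes(contrasts.fit(lmFit(y, design), cm))

# Moderated t-test for a single pairwise comparison
tt_A1B1 <- topTable(fit, coef = "A1-B1", number = Inf, sort.by = "none")

# Moderated F-test across the three comparisons within time point 1
tt_time1 <- topTable(fit, coef = c("A1-B1", "A1-C1", "B1-C1"),
                     number = Inf, sort.by = "none")
```

The three contrasts within a time point are linearly dependent (B1-C1 is the difference of the other two), so the F-test for one time point has two numerator degrees of freedom.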

I'm now considering whether it would be appropriate to use some of the more advanced options for multiple testing correction in decideTests in a scenario like this, where three closely related contrasts are analyzed. Would the nestedF method be appropriate, and how would one handle having two different F-tests in a single setup (all comparisons within timepoint 1 and all comparisons within timepoint 2), given that only one F.p.value is stored in an MArrayLM object?

Another option would be to use the recent stageR package (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1277-0). In the discussion section of the paper they discuss a similar study setup:

"For example, a DGE study that compares three drugs (e.g. a new drug, the current state of the art and a placebo) would require exactly the same data analysis paradigm as the Hammer dataset: three different hypotheses of interest (mean differential expression between the drugs) and, according to Shaffer’s modified sequentially rejective Bonferroni (MSRB) procedure, no correction is needed in stage II for FWER control."

Would an appropriate implementation of this be to extract F-test p-values using topTable to use as the screening-tests and use individual t-test p-values for confirmation tests? Would a stage II correction be necessary in this case?

limma stageR decideTests • 531 views
@koen-van-den-berge-6369
Ghent University, Belgium

If you would want to analyze both timepoints and all treatments in a single stage-wise analysis, you would have two options:

• A three-stage analysis: (i) Global screening test to assess differences between any of the treatments in any of the timepoints. (ii) Conditional on rejecting (i) you may perform two screening tests, one for each timepoint, to assess in which timepoints there are differences to be discovered. (iii) confirmation stage for each timepoint separately, conditional on rejecting (i) and (ii) for that timepoint. This is currently not yet possible within stageR, but we are looking into multiple testing procedures that can take into account the three stages without increasing the multiple testing correction too dramatically.
• A two-stage analysis: (i) Global screening test to assess differences between any of the treatments in any of the timepoints. (ii) Confirmation stage across all hypotheses, i.e. all between-treatment differences across both timepoints. In stageR, this can be performed by allowing pScreen to be the F-test p-value for all treatment differences across both timepoints, and pConfirmation the six specific hypotheses of interest.

Both procedures should control the gene-level FDR at 5% across the entire analysis. Note that the second option no longer automatically takes full advantage of the logical relationships between the hypotheses that you bring up, since the statement you quote does not hold across multiple timepoints. (We are looking into implementing this so that the multiple testing correction exploits these relationships automatically.) The statement would, however, be correct if you analyzed each timepoint separately in stageR, i.e. assessing DE for every timepoint at a 5% gene-level FDR.
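A minimal sketch of the two-stage option on simulated data, with the overall F-test p-value as pScreen and the six t-test p-values as pConfirmation. The data, the spiked-in effects and the choice of `method = "holm"` are assumptions for illustration: Holm is used here only as a generic FWER correction for the six confirmation hypotheses, since no specific method is named above.

```r
library(limma)
library(stageR)

set.seed(1)
# Hypothetical data: 200 genes x 18 samples, with effects spiked into a few genes
group <- factor(rep(c("A1", "B1", "C1", "A2", "B2", "C2"), each = 3),
                levels = c("A1", "B1", "C1", "A2", "B2", "C2"))
y <- matrix(rnorm(200 * 18), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("s", 1:18)))
y[1:20, group == "A1"] <- y[1:20, group == "A1"] + 6
y[11:30, group == "C2"] <- y[11:30, group == "C2"] + 6

design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
cm <- makeContrasts(
  contrasts = c("A1-B1", "A1-C1", "B1-C1", "A2-B2", "A2-C2", "B2-C2"),
  levels = design
)
fit <- eBayes(contrasts.fit(lmFit(y, design), cm))

# Stage I: F-test across all six contrasts (both time points)
genes <- rownames(fit$coefficients)
pScreen <- setNames(fit$F.p.value, genes)

# Stage II: the six individual moderated t-tests (genes x contrasts matrix)
pConfirmation <- fit$p.value

obj <- stageR(pScreen = pScreen, pConfirmation = pConfirmation,
              pScreenAdjusted = FALSE)
obj <- stageWiseAdjustment(obj, method = "holm", alpha = 0.05)

# 0/1 matrix: first column is the screening stage, then one column per contrast
res <- getResults(obj)
```

A contrast can only be flagged for a gene that passed the screening stage, so the first column of `res` is 1 wherever any other column is 1.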

For the second option, you can achieve it by using method = "global" in the decideTests function, right? (Provided that he/she used the factorized contrast matrix).

Thanks for the detailed comment! I don't think the two-stage approach you suggest would be entirely appropriate here, as the two timepoints have very different numbers of pairwise differences (more than twice as many differences between sites at the second time point).

Would it still be correct to analyse the data using two stageR analyses, where the approach is: 1) extract F-test p-values for each time point, and 2) adjust t-test p-values using stageWiseAdjustment with method="none"?

Yes, two stageR analyses would be appropriate in that case, and you are correct that according to Shaffer's procedure, you are allowed to not adopt a FWER correction in the confirmation stage, so indeed method="none" will be valid here.
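A minimal sketch of that per-timepoint analysis on simulated data. The data, the spiked-in effects and the helper `run_timepoint` are hypothetical and only serve to show the calls. Refitting the contrasts per time point gives each analysis its own F.p.value for screening.

```r
library(limma)
library(stageR)

set.seed(1)
# Hypothetical data: 200 genes x 18 samples, with effects spiked into a few genes
group <- factor(rep(c("A1", "B1", "C1", "A2", "B2", "C2"), each = 3),
                levels = c("A1", "B1", "C1", "A2", "B2", "C2"))
y <- matrix(rnorm(200 * 18), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("s", 1:18)))
y[1:20, group == "A1"] <- y[1:20, group == "A1"] + 6
y[11:30, group == "C2"] <- y[11:30, group == "C2"] + 6

design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
cm <- makeContrasts(
  contrasts = c("A1-B1", "A1-C1", "B1-C1", "A2-B2", "A2-C2", "B2-C2"),
  levels = design
)
lmfit <- lmFit(y, design)

# Hypothetical helper: one stage-wise analysis for one time point's contrasts
run_timepoint <- function(lmfit, cm_sub, alpha = 0.05) {
  fit_tp <- eBayes(contrasts.fit(lmfit, cm_sub))
  genes <- rownames(fit_tp$coefficients)
  obj <- stageR(pScreen = setNames(fit_tp$F.p.value, genes),  # per-timepoint F-test
                pConfirmation = fit_tp$p.value,               # three pairwise t-tests
                pScreenAdjusted = FALSE)
  # No FWER correction in the confirmation stage (Shaffer's MSRB, three hypotheses)
  obj <- stageWiseAdjustment(obj, method = "none", alpha = alpha)
  getResults(obj)
}

res_t1 <- run_timepoint(lmfit, cm[, 1:3])  # time point 1 contrasts
res_t2 <- run_timepoint(lmfit, cm[, 4:6])  # time point 2 contrasts
```

Each call controls the gene-level FDR at `alpha` for its own time point only, so the two result matrices should be read as two separate analyses.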