Question

Should subgroups of comparisons be analyzed separately in limma?

raf4 ▴ 20

@raf4-8249

United States

Dear Bioconductor,

I have a Fluidigm (or other array) experiment consisting of 4 groups: A, B, C, and D. The experimentalist I am helping wants to compare B to A and D to C, but not B to D, B to C, etc.

Do I

1. Run all 4 groups together in limma and then analyze the individual contrasts,

OR

2. Compare B to A and D to C in 2 separate limma runs?

I believe that 1 is correct, because in an ANOVA one should take into account the variability of all of the samples. Furthermore, as I understand limma, more samples improve the empirical Bayes estimate of the variance. However, this approach has recently been questioned by 3 different experimental collaborators, from 3 different labs, in 3 different contexts, so I think it would be prudent to ask the list.

Thanks and best wishes,

Rich

Richard Friedman

limma design matrix • 936 views

updated 6.4 years ago by Gavin Kelly ▴ 680 • written 6.4 years ago by raf4 ▴ 20

score 1 · Answer 1 · 2017-11-24

It is a judgement call, and your situation is one that many of us can sympathise with. (1) is the 'correct' approach from a statistical point of view, provided you have no reason to doubt that the variance in A and B is roughly similar to the variance in C and D. You get more power that way, and the (unmoderated) fold-change estimates are identical to what you'd get with the other approach; the difference is that you have more samples with which to estimate the dispersion (estimating this is hard, so it's best to involve as many samples as possible).
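To make the point above concrete, here is a minimal sketch in plain Python (not limma itself) for a single gene. The expression values are made up, and the helper names `pooled_variance` and `contrast` are hypothetical. It uses an ordinary, unmoderated t-statistic, so it only illustrates the two claims in the paragraph: the B vs A difference is the same either way, while the joint fit has more residual degrees of freedom for the variance estimate.

```python
from statistics import mean, variance

# Hypothetical log-expression values for one gene (made-up numbers).
groups = {
    "A": [5.1, 4.9, 5.3],
    "B": [6.0, 6.4, 6.2],
    "C": [5.0, 5.6, 4.7],
    "D": [5.9, 5.2, 6.1],
}

def pooled_variance(names):
    """Residual variance and its degrees of freedom, pooled over the named groups."""
    df = sum(len(groups[g]) - 1 for g in names)
    ss = sum(variance(groups[g]) * (len(groups[g]) - 1) for g in names)
    return ss / df, df

def contrast(num, den, names):
    """Difference in group means, ordinary t-statistic, and residual df,
    using the variance pooled over `names`."""
    s2, df = pooled_variance(names)
    diff = mean(groups[num]) - mean(groups[den])
    se = (s2 * (1 / len(groups[num]) + 1 / len(groups[den]))) ** 0.5
    return diff, diff / se, df

# Approach (1): all four groups contribute to the variance estimate.
joint_BA = contrast("B", "A", ["A", "B", "C", "D"])
# Approach (2): only A and B are used.
separate_BA = contrast("B", "A", ["A", "B"])

print("joint   :", joint_BA)
print("separate:", separate_BA)
```

The first element of each result (the difference in means) is the same in both fits; the last element (residual degrees of freedom) is larger for the joint fit. limma's empirical Bayes moderation works on top of these residual variances, which this sketch does not attempt to reproduce.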

The issue arises if the design is really two experiments cobbled together, in which case you'd be quite entitled to expect one branch to be more variable than the other. The same applies where one 'treatment' is much more noise-inducing than another (e.g. normal and tumour, or a preponderance of 'outlier samples' in one treatment) and all you're interested in is separate results within normal and within tumour. But as soon as you want to compare across the two branches, you're back to (1). A PCA plot can sometimes provide insight one way or the other about the equality of variability.
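As a rough numerical companion to the PCA plot, one could compare the within-group variance of the two branches directly. The sketch below does this for one gene with made-up values; the `branch_1` and `branch_2` names and the `pooled_variance` helper are assumptions for illustration. With only a few replicates a single ratio is very noisy, so in practice one would look at how such ratios are distributed across all genes rather than at any one of them.

```python
from statistics import variance

# Hypothetical log-expression values for one gene (made-up numbers).
branch_1 = {"A": [5.1, 4.9, 5.3], "B": [6.0, 6.4, 6.2]}
branch_2 = {"C": [5.0, 5.6, 4.7], "D": [5.9, 5.2, 6.1]}

def pooled_variance(branch):
    """Within-group variance pooled over the groups of one branch."""
    df = sum(len(values) - 1 for values in branch.values())
    ss = sum(variance(values) * (len(values) - 1) for values in branch.values())
    return ss / df

# A ratio far from 1 for many genes would suggest the branches differ in noise.
ratio = pooled_variance(branch_2) / pooled_variance(branch_1)
print("variance ratio (branch 2 / branch 1):", ratio)
```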

Another common situation is for experimentalists to ask for A vs B and C vs D (however you decide to fit them), and then take the results away and do a 'Venn diagram in Excel' analysis. I always warn against this (it is often preferable to do an interaction test, since a difference in significance is not the same as a significant difference), but sometimes have to allow it if the biology really requires 'no change' in one branch.
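The interaction test mentioned above asks directly whether the B vs A change differs from the D vs C change, rather than comparing two lists of significant genes. A minimal sketch for one gene, in plain Python with made-up values and a hypothetical `interaction_contrast` helper, might look like this; it uses an ordinary (unmoderated) t-statistic and the variance pooled over all four groups, so it assumes the joint fit of approach (1).

```python
from statistics import mean, variance

# Hypothetical log-expression values for one gene (made-up numbers).
groups = {
    "A": [5.1, 4.9, 5.3],
    "B": [6.0, 6.4, 6.2],
    "C": [5.0, 5.6, 4.7],
    "D": [5.9, 5.2, 6.1],
}

def interaction_contrast(groups):
    """Estimate (B - A) - (D - C), its ordinary t-statistic, and residual df."""
    df = sum(len(values) - 1 for values in groups.values())
    s2 = sum(variance(v) * (len(v) - 1) for v in groups.values()) / df
    estimate = (mean(groups["B"]) - mean(groups["A"])) - (
        mean(groups["D"]) - mean(groups["C"])
    )
    # Each group mean enters the contrast with coefficient +1 or -1.
    se = (s2 * sum(1 / len(v) for v in groups.values())) ** 0.5
    return estimate, estimate / se, df

estimate, t_stat, df = interaction_contrast(groups)
print("interaction estimate:", estimate, "t:", t_stat, "df:", df)
```

In limma the same quantity would be expressed as a contrast of the form (B - A) - (D - C) on the joint four-group fit, which is only possible when all groups are modelled together.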

Then you've the problem of persuading the scientist. I often ask them to explain why they struggled to control noise (or follow protocols) in one branch of their experiment as well as they did in the other branch; this is quite persuasive.