Question: edgeR DEG how many treatments should be fitted to GLM
29 days ago
I have a question about how many treatments should be included when fitting the GLM. Say I have four treatments (A, B, C and D) and one control, and each group (including the control) has 3 replicates. All treatments and the control are from the same species. I want to compare A against B and A against control. I am hesitating between two options. The first is to fit ONLY A, B and control to a generalised linear model (GLM) and compare them. The second is to fit all of A, B, C, D and control to the GLM, which allows any pair of groups to be compared, and then extract the A-B and A-control comparisons.

Are both reasonable? Which is better? What difference can be expected between the two options?


Answer: edgeR DEG how many treatments should be fitted to GLM
29 days ago
Gordon Smyth
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Both approaches are reasonable.

If all treatments A-D were part of the same experiment, done on the same type of cells at the same time, then fitting a model to all the data at once is usually better. This is because it provides more samples from which to estimate variability for each gene.
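
To make the "more samples" point concrete, here is a minimal sketch that counts the residual degrees of freedom available for estimating variability under each option. It assumes the one-way layout from the question (one coefficient per group, 3 replicates each). The function name `residual_df` is made up for illustration and is not part of edgeR.

```python
# Hypothetical layout from the question: 3 replicates per group.
REPLICATES = 3


def residual_df(groups, replicates=REPLICATES):
    """Residual degrees of freedom for a one-way layout.

    Each group contributes `replicates` samples and uses up one
    coefficient (its group mean), so df = samples - coefficients.
    """
    n_samples = len(groups) * replicates
    n_coefficients = len(groups)
    return n_samples - n_coefficients


# Option 1: fit only control, A and B.
subset_df = residual_df(["control", "A", "B"])

# Option 2: fit all five groups, then extract A-B and A-control.
full_df = residual_df(["control", "A", "B", "C", "D"])

print(subset_df, full_df)  # prints: 6 10
```

With all five groups in the model, each gene's variability is estimated from 10 residual degrees of freedom instead of 6, even though only the A-B and A-control contrasts are reported.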

If however C and D are very different from A and B for some reason, for example being profiles of a different cell type, then just analyzing A and B alone would usually be preferable. In this case, the variability of replicate samples for treatments C and D might not be representative of what we would expect for A and B.
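
The caveat above can be illustrated with a simplified analogy. The sketch below pools within-group variances for a single gene using made-up, log-scale expression values in which C and D are much noisier than the other groups. This is an ordinary pooled variance, not edgeR's negative binomial dispersion estimate. The data and the helper `pooled_variance` are hypothetical.

```python
from statistics import variance


def pooled_variance(groups):
    """Pooled within-group variance, weighting each group by its df (n - 1)."""
    numerator = sum((len(v) - 1) * variance(v) for v in groups.values())
    denominator = sum(len(v) - 1 for v in groups.values())
    return numerator / denominator


# Made-up values for one gene: control, A and B replicate tightly,
# while C and D (e.g. a different cell type) are far more variable.
data = {
    "control": [10.0, 11.0, 12.0],
    "A": [20.0, 21.0, 22.0],
    "B": [15.0, 16.0, 17.0],
    "C": [30.0, 40.0, 50.0],
    "D": [5.0, 15.0, 25.0],
}

subset = {name: data[name] for name in ("control", "A", "B")}

subset_var = pooled_variance(subset)  # variability among A, B and control only
full_var = pooled_variance(data)      # variability pooled over all five groups

print(subset_var, full_var)
```

In this toy case the pooled estimate over all groups is dominated by C and D, so it overstates the variability relevant to the A-B and A-control comparisons. That is the situation in which analyzing A, B and control on their own would be preferable.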

Good answer. Thanks a lot!

