First, I know that DESeq2 is not meant to be run without replicates and that the results will not be meaningful. I am an analyst with the following design matrix:

Group | Condition |
---|---|
A | 1 |
B | 1 |
C | 1 |
A | 2 |
B | 2 |
C | 2 |
A | 3 |
B | 3 |
C | 3 |

I ran DESeq2 with an LRT of ~Group + Condition against the reduced model ~Group to identify all features that are differentially expressed due to Condition. However, my collaborators want to know whether any additional features are differentially expressed between Group B Condition 1 and Group B Condition 2. I understand that this would normally call for an interaction term, but there are no replicates and the experiment can't be repeated for reasons outside my control.

One way I could think of to look at Group B Condition 1 vs. Group B Condition 2 would be to normalize the counts for each sample, compute the log fold change, and check whether any genes that are not already significant have a log fold change greater than some predefined threshold, or greater than the upper quartile of the significant genes' log fold changes.

Another would be to create a fake Condition, so that the design matrix instead becomes:

Group | Condition |
---|---|
A | 1 |
B | interest1 |
C | 1 |
A | 2 |
B | interest2 |
C | 2 |
A | 3 |
B | 3 |
C | 3 |

This would allow a contrast between interest1 and interest2, although all of the p-values will be close to 1. I would then do the same as before: check whether any genes have a log fold change greater than a preset threshold or the upper quartile of the significant genes' log fold changes. I believe this approach would still take gene-wise dispersion and sample-wise size factors into account. So, why is this a bad idea that I shouldn't pursue?
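
To make the two ideas concrete, here is a minimal Python sketch of the bookkeeping involved. It is not DESeq2 and does not reproduce its dispersion estimation or shrinkage. It assumes median-of-ratios size factors, a pseudocount of 1 before taking logs, and a fold change reported as log2(Group B Condition 2 / Group B Condition 1). The function names, the toy counts, and the `already_significant` set are all made up for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts maps gene -> list of raw counts (one entry per sample).
    Genes with a zero in any sample are skipped, because their
    geometric mean across samples would be zero.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if min(row) <= 0:
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def log2_fold_changes(counts, factors, num_idx, den_idx, pseudocount=1.0):
    """log2(numerator / denominator) on size-factor-normalized counts."""
    lfc = {}
    for gene, row in counts.items():
        num = row[num_idx] / factors[num_idx] + pseudocount
        den = row[den_idx] / factors[den_idx] + pseudocount
        lfc[gene] = math.log2(num / den)
    return lfc


def flag_candidates(lfc, already_significant, threshold):
    """Genes with |log2FC| above threshold that were not already significant."""
    return sorted(
        gene
        for gene, value in lfc.items()
        if abs(value) > threshold and gene not in already_significant
    )


def recode_condition(samples, group="B", conditions=("1", "2")):
    """Second idea: relabel Condition as 'interest<cond>' for one group only."""
    return [
        f"interest{cond}" if grp == group and cond in conditions else cond
        for grp, cond in samples
    ]


# Samples in the same order as the design matrix in the question.
samples = [("A", "1"), ("B", "1"), ("C", "1"),
           ("A", "2"), ("B", "2"), ("C", "2"),
           ("A", "3"), ("B", "3"), ("C", "3")]

# Invented counts, purely to exercise the functions.
toy_counts = {
    "geneA": [100, 110, 95, 105, 100, 98, 102, 99, 101],
    "geneB": [50, 20, 55, 48, 160, 52, 51, 49, 50],
    "geneC": [10, 12, 9, 11, 10, 10, 12, 11, 10],
    "geneD": [500, 480, 510, 495, 505, 490, 500, 515, 498],
}
already_significant = {"geneD"}  # hypothetical stand-in for the LRT hits

factors = size_factors(toy_counts)
b1, b2 = samples.index(("B", "1")), samples.index(("B", "2"))
lfc = log2_fold_changes(toy_counts, factors, num_idx=b2, den_idx=b1)
print(flag_candidates(lfc, already_significant, threshold=1.0))
print(recode_condition(samples))
```

The threshold of 1.0 is an arbitrary preset; the upper-quartile variant would instead derive it from the absolute log fold changes of the genes in `already_significant`. Note that this screen ranks genes on a single pair of samples, so it carries no estimate of variability for that comparison.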