Dear all,
I have analyzed an RNA-seq dataset; the experimental design is shown in the table below.
The two questions we are trying to answer are:
1. Find DE genes between the different conditions. Each condition has 4 samples (biological replicates), and I performed pairwise differential expression analysis with DESeq2, e.g. (A-B), (A-C), (A-D), (A-E), (B-C) ... (D-E). This works, and I can see the DE genes.
2. Find DE genes and the fold-change differences between the two Events, X and Y, listed in column 3 of the table below. Here the unequal group sizes (16 samples for X versus 4 for Y) have raised some concerns with the reviewer of the study.

| Samples | Condition | Event | Replicates |
|---------- |----------- |------- |------------ |
| Sample1 | A | X | 1 |
| Sample2 | A | X | 2 |
| Sample3 | A | X | 3 |
| Sample4 | A | X | 4 |
| Sample5 | B | X | 1 |
| Sample6 | B | X | 2 |
| Sample7 | B | X | 3 |
| Sample8 | B | X | 4 |
| Sample9 | C | X | 1 |
| Sample10 | C | X | 2 |
| Sample11 | C | X | 3 |
| Sample12 | C | X | 4 |
| Sample13 | D | X | 1 |
| Sample14 | D | X | 2 |
| Sample15 | D | X | 3 |
| Sample16 | D | X | 4 |
| Sample17 | E | Y | 1 |
| Sample18 | E | Y | 2 |
| Sample19 | E | Y | 3 |
| Sample20 | E | Y | 4 |
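
To make the structure of the table explicit, here is a minimal Python sketch (not part of the DESeq2 analysis itself) that rebuilds the sample sheet, lists the pairwise condition contrasts from question 1, and tallies the group sizes for question 2. The variable names (`event_of`, `samples`, `conditions_per_event`) are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Condition -> Event mapping, copied from the table above.
event_of = {"A": "X", "B": "X", "C": "X", "D": "X", "E": "Y"}

# Rebuild the sample sheet: 4 biological replicates per condition.
samples = []
for condition, event in event_of.items():
    for replicate in range(1, 5):
        samples.append({
            "sample": f"Sample{len(samples) + 1}",
            "condition": condition,
            "event": event,
            "replicate": replicate,
        })

# Question 1: every pairwise contrast between conditions.
pairwise = [f"{a}-{b}" for a, b in combinations(sorted(event_of), 2)]

# Question 2: number of samples per event.
event_sizes = Counter(s["event"] for s in samples)

# Which conditions contribute samples to each event.
conditions_per_event = {}
for s in samples:
    conditions_per_event.setdefault(s["event"], set()).add(s["condition"])

print(pairwise)
print(dict(event_sizes))
print({e: sorted(c) for e, c in conditions_per_event.items()})
```

This gives 10 pairwise contrasts, and 16 samples for Event X against 4 for Event Y. It also shows that Event Y consists only of condition E, so in this design the X versus Y comparison cannot be separated from a comparison of condition E against conditions A to D.
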
To my understanding, DESeq2 can handle unequal group sizes in a DE analysis, since it estimates the group means before calculating fold changes. An example in the DESeq2 vignette also uses a different number of replicates per condition, although the imbalance there is not as large as in this study. I would appreciate feedback on this design and on how to improve the statistical analysis.
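
As a rough way to reason about the imbalance, the sketch below uses a deliberately simplified assumption: a plain difference of two group means with the same per-sample variance, whose standard error scales with sqrt(1/n1 + 1/n2). This is not what DESeq2 computes (it fits a negative binomial GLM with estimated dispersions), and `se_factor` is a hypothetical helper, but it gives a feel for how group sizes affect precision.

```python
import math

def se_factor(n1, n2):
    """Scaling of the standard error of a difference of two group means,
    assuming equal per-sample variance in both groups (a simplification)."""
    return math.sqrt(1.0 / n1 + 1.0 / n2)

balanced_pair = se_factor(4, 4)     # one pairwise contrast, e.g. A vs B
event_contrast = se_factor(16, 4)   # Event X (16 samples) vs Event Y (4 samples)
equal_split = se_factor(10, 10)     # same 20 samples if split evenly

print(round(balanced_pair, 3))   # 0.707
print(round(event_contrast, 3))  # 0.559
print(round(equal_split, 3))     # 0.447
```

Under this simplification, the 16 versus 4 comparison is no less precise than a 4 versus 4 pairwise contrast, but precision is limited by the smaller group: the same 20 samples split evenly would give a smaller standard error.
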
Best regards, Zohaib
