Question

Are DESeq2 significant results with 2 replicates per group less reliable than significant results with more replicates?

jovel_juan ▴ 30

@jovel_juan-7129

Canada

We recently had a paper rejected for publication. Initially, two reviewers accepted the paper and one rejected it, arguing that having only two replicates per condition (control plants and plants infected with a fungus) does not produce reliable results. We argued that DESeq2 is capable of conducting differential expression analysis in experiments with only 3 samples (2 degrees of freedom), and since we had two replicates per group (n=4; 3 degrees of freedom) our experimental design was just fine. We also argued that there is a statistical penalty for having fewer replicates (fewer degrees of freedom), and that this was reflected in a smaller number of differentially expressed genes. Our paper was sent to another reviewer and was again rejected under the same argument: a 2x2 experiment does not produce reliable results. We are not planning to contest the journal's decision, but would like to hear from experts on this issue, so that we can avoid conducting experiments with only two replicates in the future IF that is statistically unreliable, as the reviewers told us.

So, is an adjusted p-value = 0.0001 with two replicates less trustworthy than the same adjusted p-value with three or more replicates?

I am adding a Figure of one of our comparisons. We had three groups. Control plants not inoculated with the fungus (Control), plants inoculated with a non-pathogenic strain of the fungus (Fo-npt), and plants inoculated with a pathogenic strain of the fungus (Fol-pt).

Thanks, any comment will be greatly appreciated.

deseq2 • 2.2k views

updated 5.2 years ago by Michael Love 43k • written 5.2 years ago by jovel_juan ▴ 30

score 0 · Answer 1 · 2020-10-12

You can refer to Schurch 2016, Figure 1 (here showing edgeR performance): the TPR is low for the smallest n, but the FPR is flat. My only complaint with this paper is that they should have evaluated adjusted p-values against observed FDR and raw p-values against observed FPR (as that is how thresholding those statistics is designed to work), but you can get the picture from this.

[Figure: Schurch 2016, Figure 1]
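To make the distinction between the two evaluations concrete, here is a minimal sketch in plain Python. It is not DESeq2 or edgeR code and not the method used in the paper: the function names `bh_adjust` and `observed_rates`, the toy p-values, and the truth labels are all made up for illustration, and the adjustment is assumed to be Benjamini-Hochberg. The idea is simply that adjusted p-values thresholded at alpha are compared with the observed FDR, while raw p-values thresholded at alpha are compared with the observed FPR.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def observed_rates(pvalues, is_true_de, alpha=0.05):
    """Observed FDR and TPR (from adjusted p-values) and FPR (from raw p-values)."""
    padj = bh_adjust(pvalues)
    n_de = sum(is_true_de)
    n_null = len(is_true_de) - n_de

    # Calls made on adjusted p-values: false discoveries and true discoveries.
    false_adj = sum(p <= alpha and not t for p, t in zip(padj, is_true_de))
    true_adj = sum(p <= alpha and t for p, t in zip(padj, is_true_de))
    n_calls = false_adj + true_adj

    # Calls made on raw p-values: false positives among the true nulls.
    false_raw = sum(p <= alpha and not t for p, t in zip(pvalues, is_true_de))

    return {
        "fdr": false_adj / n_calls if n_calls else 0.0,
        "tpr": true_adj / n_de if n_de else 0.0,
        "fpr": false_raw / n_null if n_null else 0.0,
    }


# Toy example (hypothetical values): five genes, three truly DE.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_truth = [True, True, False, True, False]
print(observed_rates(toy_p, toy_truth))
```

With these toy values, two genes pass the adjusted threshold and both are truly DE, so the observed FDR is 0 and the TPR is 2/3, while one of the two null genes has a raw p-value below 0.05, giving an observed FPR of 0.5. In a real benchmark the truth labels would come from a simulation or a high-replicate reference set, and the rates would be tracked as the number of replicates changes.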