In my data, I have two groups: a treatment group and a control group. Each group has two biological replicates (i.e., two replicated populations): "1" and "2". I would like to either nest the replicates within the main groups or set replicate as a random effect when running DESeq2. I am not particularly interested in differences between the replicates (we would like to assume that there are none), but I have to take replicate into account in the model. However, I don't know how to run a model with a random effect, so I just set group and replicate as main effects. When doing that, can I simply have two columns in my design: one for treatment ("treated", "control") and one for replicate ("1", "2")? Can I then be sure that in my model (counts ~ replicate + treatment) the replicates are nested within the treatment? Or should I code my replicates as "1", "2", "3", "4" ("1" and "2" within "treated" and "3" and "4" within "control")?
This is probably a silly question but I would really like to be sure that my design is correct.
DESeq2 only allows for fixed effects. Because you are interested in comparing *across* replicates (each replicate population belongs to only one condition), you can't simultaneously fit the replicate and condition effects: with the replicates coded 1 to 4, the treatment effect is collinear with replicate 3 + replicate 4. Your choices are then: (i) you can assume that the treatment effect is (rep3+rep4)/2 - (rep1+rep2)/2, which seems reasonable, or (ii) you can just fit the condition effect, in which case variance across replicates within a condition will increase the dispersion estimates.
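To make the collinearity concrete, here is a minimal NumPy sketch (not DESeq2 code). It assumes a hypothetical layout of three samples per replicate population and follows the labelling above, where populations 3 and 4 are the treated ones. The variable names are made up for illustration.

```python
import numpy as np

# Hypothetical layout: 4 replicate populations, 3 samples each (12 samples).
# Populations 1 and 2 are control, 3 and 4 are treated (assumed labelling).
replicate = np.repeat([1, 2, 3, 4], 3)
treated = (replicate >= 3).astype(float)

# Indicator (one-hot) columns for the four replicate populations.
rep_cols = (replicate[:, None] == np.array([1, 2, 3, 4])).astype(float)

# Design with the replicate indicators plus a treatment column.
design_full = np.column_stack([rep_cols, treated])

# The treatment column is exactly the sum of the rep3 and rep4 indicators.
treatment_is_redundant = np.allclose(treated, rep_cols[:, 2] + rep_cols[:, 3])

# Adding the treatment column adds a column but no rank.
rank_rep_only = np.linalg.matrix_rank(rep_cols)
rank_full = np.linalg.matrix_rank(design_full)
print(treatment_is_redundant, rank_rep_only, rank_full, design_full.shape[1])
```

The combined design has five columns but only rank four, so a separate treatment coefficient cannot be estimated alongside one coefficient per replicate population.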
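For (i) you would use a design of ~replicate, with the replicates coded 1 to 4, and then extract the treatment effect as a contrast of the replicate coefficients that averages the two treated replicates against the two control replicates.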
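The contrast in option (i) can be sketched numerically. This is plain least squares in NumPy on simulated log-scale values for a single gene, not DESeq2's negative binomial fit; the data, the true means, and the three-samples-per-population layout are all assumptions made for illustration. It shows how (rep3+rep4)/2 - (rep1+rep2)/2 translates into a contrast vector when the model uses reference-level coding with population 1 as the baseline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for one gene: 4 replicate populations, 3 samples each.
replicate = np.repeat([1, 2, 3, 4], 3)
# Assumed true log2 means per population (1, 2 control; 3, 4 treated).
true_means = np.array([5.0, 5.2, 6.1, 5.9])
y = true_means[replicate - 1] + rng.normal(scale=0.1, size=replicate.size)

# Reference-level coding: intercept (population 1) plus offsets
# for populations 2, 3 and 4 relative to population 1.
X = np.column_stack([
    np.ones(replicate.size),
    replicate == 2,
    replicate == 3,
    replicate == 4,
]).astype(float)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# (rep3 + rep4)/2 - (rep1 + rep2)/2 expressed in these coefficients:
# the intercept cancels, leaving (b3 + b4)/2 - b2/2.
contrast = np.array([0.0, -0.5, 0.5, 0.5])
effect_from_contrast = contrast @ beta

# Same quantity computed directly from the per-population sample means.
pop_means = np.array([y[replicate == r].mean() for r in (1, 2, 3, 4)])
effect_from_means = pop_means[2:].mean() - pop_means[:2].mean()

print(effect_from_contrast, effect_from_means)
```

The two printed values agree up to floating-point error, which is the point: the averaged treatment effect is a linear combination of the replicate coefficients, so it can be tested without a separate treatment term in the design.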
Also, as I usually point out for experimental designs like this: with limma you can inform the model of correlations among samples (e.g., samples from replicate 1 are correlated) without including a fixed effect in the model. See the duplicateCorrelation function in the limma package.