I have 12 RNA-seq samples: 3 replicates each of male control, male mutant, female control, and female mutant. I want a list of genes that are significantly differentially expressed in males (mutant vs control) and in females (mutant vs control). I'm not interested in comparing, for example, male control vs female control, though I may do something like a Venn diagram of genes that are differentially expressed in both males and females. Should I do two independent analyses for male and female, or combine everything together (i.e. all samples in the same SummarizedExperiment and DESeqDataSet) and then use contrasts to specify the two comparisons (i.e. contrast=c("Group","male.knockout","male.control") and contrast=c("Group","female.knockout","female.control"))?
Generally you get better statistical power if you have all the samples in the same dataset, because the variance for each gene is estimated from more samples and so with more degrees of freedom. The caveat is that if one half of your experiment has, for biological or technical reasons, a different degree of variability, or a greater propensity for samples to be outliers, then the combined approach will overestimate the variability in one half of the experiment and underestimate it in the other. My intuition is that this doesn't look like one of those situations. You can get some feel for it by looking at PCA plots or clusterings: if the clusters in one branch of the experiment are much tighter than in the other branch, then you might want to try both approaches and see whether known positive-control genes are detected better in one case than the other.
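As a rough sketch of the combined approach, assuming a single factor called Group with the level names used in the question: the simulated counts from makeExampleDESeqDataSet are only a stand-in for your real count matrix, and the object names (res_male, res_female, vsd) are just illustrative.

```r
library(DESeq2)

# Simulated counts stand in for the real data (assumption): 2000 genes, 12 samples.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# One factor holding the four groups, named as in the question.
dds$Group <- factor(rep(c("male.control", "male.knockout",
                          "female.control", "female.knockout"), each = 3))
design(dds) <- ~ Group

# Fit the model once, on all 12 samples.
dds <- DESeq(dds)

# Pull out the two comparisons of interest from the same fit.
res_male   <- results(dds, contrast = c("Group", "male.knockout", "male.control"))
res_female <- results(dds, contrast = c("Group", "female.knockout", "female.control"))

# PCA on variance-stabilized counts, to see whether one branch of the
# experiment clusters much more tightly than the other.
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
plotPCA(vsd, intgroup = "Group")
```

If the PCA suggests that males and females differ a lot in variability, the same code can be run on each sex separately (subsetting the columns of dds before calling DESeq) to compare the two approaches.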
Another reason for doing the combined approach is that it lets you fit a 2x2 design with an interaction term, so you can test directly for a different response to KO between the sexes without having to resort to a Venn-diagram-like approach (which often suffers because it compounds two rounds of statistical error).
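A minimal sketch of that 2x2 design, again on simulated data: the column names sex and genotype and the choice of female and control as reference levels are assumptions for illustration. The coefficient names depend on your factor levels, so check resultsNames(dds) rather than relying on the names written here.

```r
library(DESeq2)

# Simulated counts stand in for the real data (assumption).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Two separate factors instead of one combined Group factor.
# The first level of each factor is the reference level.
dds$sex      <- factor(rep(c("male", "female"), each = 6),
                       levels = c("female", "male"))
dds$genotype <- factor(rep(rep(c("control", "knockout"), each = 3), times = 2),
                       levels = c("control", "knockout"))

# Main effects plus the sex-by-genotype interaction.
design(dds) <- ~ sex + genotype + sex:genotype
dds <- DESeq(dds)

# List the fitted coefficients; the interaction is the one combining both factors.
resultsNames(dds)

# Knockout effect in the reference sex (female here).
res_female_ko <- results(dds, name = "genotype_knockout_vs_control")

# Interaction: is the knockout effect different in males than in females?
res_interaction <- results(dds, name = "sexmale.genotypeknockout")
```

Genes with a small adjusted p-value in res_interaction are the ones whose response to the knockout differs between the sexes, which is the question the Venn diagram only answers indirectly.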