I apologise for the naive question, but after reading the DESeq2 paper and vignette I am still not sure I fully understand how the method works.
My questions are as follows:
1. Since one dispersion parameter is estimated per gene, does that estimate ignore the group membership of each sample? Is the same true for the mean parameter?
2. Is a single negative binomial model fitted per gene? If so, does this assume that the counts follow the same distribution in every condition (e.g. each genotype), or have I misunderstood?
3. Why can't you just compare the groups with a t-test? Is the point of fitting a negative binomial model to estimate the variance you would expect if you had collected more samples?
4. How does the Wald test identify differentially expressed genes (DEGs)? Does it essentially look at where the samples from a given condition lie within the negative binomial (NB) distribution fitted for that gene?
Thanks for any explanations/clarifications. My lack of stats training is really hindering me here!