Dear all,

I believe I am using the DESeq2 design formula correctly, but after getting some unexpected results I would like to double-check with you.

I want to test the effect of Vitamin D in patients. I have one group that takes Vitamin D and one group that does not. Since these groups are quite different, I need to adjust for Gender, Age, Batch and BMI.

My formula is:

dds <- DESeqDataSetFromMatrix(countData = cts, colData = colData, design = ~ batch + Gender + Age + BMI + Vitamin_D_Group)

I understand that I need to put the variable I want to analyse at the end of the formula and the variables I am adjusting for at the beginning, following this statement: "With no additional arguments to results, the log2 fold change and Wald test p value will be for the last variable in the design formula, and if this is a factor, the comparison will be the last level of this variable over the reference level".

Can anyone confirm that this is correct?

Thank you so much, Bine

The question is whether I have set up my formula correctly, so that it adjusts for batch, Gender, Age and BMI and tests differential expression for the Vitamin D group.

Yes, this is how we correct for multiple terms in linear models.
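If you would rather not rely on the position of the term or on the default level order, you can name the comparison explicitly. Below is a minimal sketch on simulated data. The sample table, the counts and the level names "yes" and "no" are made up for illustration (your own colData will differ), and the DESeq2 part only runs if the package is installed.

```r
# Hypothetical sample table: column names mirror the design in the question,
# all values are simulated for illustration only.
set.seed(1)
n <- 12
colData <- data.frame(
  batch = factor(rep(c("b1", "b2"), times = n / 2)),
  Gender = factor(rep(c("F", "M"), each = n / 2)),
  Age = round(runif(n, 30, 70)),
  BMI = round(runif(n, 19, 32), 1),
  Vitamin_D_Group = factor(sample(rep(c("no", "yes"), each = n / 2)))
)

# Make the untreated group the reference level, so that fold changes
# are reported as "yes" over "no".
colData$Vitamin_D_Group <- relevel(colData$Vitamin_D_Group, ref = "no")

design <- ~ batch + Gender + Age + BMI + Vitamin_D_Group

# The last column of the model matrix is the coefficient for the variable of interest.
mm <- model.matrix(design, data = colData)
last_coef <- colnames(mm)[ncol(mm)]
print(last_coef)  # factor name followed by the non-reference level

if (requireNamespace("DESeq2", quietly = TRUE)) {
  suppressPackageStartupMessages(library(DESeq2))

  # Simulated counts: 200 genes with gene-specific means, negative binomial noise.
  n_genes <- 200
  gene_mu <- 2^runif(n_genes, 4, 10)
  cts <- matrix(
    rnbinom(n_genes * n, mu = rep(gene_mu, times = n), size = 5),
    ncol = n,
    dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", seq_len(n)))
  )
  rownames(colData) <- colnames(cts)

  dds <- DESeqDataSetFromMatrix(countData = cts, colData = colData, design = design)
  dds <- DESeq(dds)

  # List the fitted coefficients, then request the comparison by name
  # instead of relying on the "last variable" default.
  print(resultsNames(dds))
  res <- results(dds, contrast = c("Vitamin_D_Group", "yes", "no"))
  print(head(res))
}
```

With the contrast given explicitly, the result does not depend on where Vitamin_D_Group sits in the formula. The other terms are still adjusted for in either case.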

However, note that if your design is unbalanced and you include many variables in it, you can lose statistical power. That is a general property of linear models and has nothing to do with DESeq2 specifically.
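One quick way to see how balanced a design is would be to cross-tabulate the group of interest against the categorical covariates. This is only a sketch with a made-up sample table; the column names follow the question, and the values are hypothetical.

```r
# Hypothetical sample table, deliberately unbalanced for illustration.
colData <- data.frame(
  batch = factor(c("b1", "b1", "b1", "b2", "b2", "b2", "b2", "b2")),
  Gender = factor(c("F", "M", "F", "M", "F", "M", "F", "M")),
  Vitamin_D_Group = factor(c("yes", "yes", "yes", "no", "no", "no", "no", "yes"))
)

# Count samples per combination of group and batch.
tab <- table(colData$Vitamin_D_Group, colData$batch)
print(tab)

# Empty cells mean some group/batch combinations were never observed,
# which makes the two effects harder to separate.
has_empty_cells <- any(tab == 0)
print(has_empty_cells)

# The same check for another categorical covariate.
print(table(colData$Vitamin_D_Group, colData$Gender))
```

In this toy table all "no" samples come from batch b2, so the batch and Vitamin D effects are only partly separable.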

Thank you for your answer.