Hi
I have a raw count data file and its metadata file, which looks like this:
sample Batch Trt Status Age Sex BMI
S1 1 D R 33 M 25
S2 2 D NR 38 F 32
S3 1 D R 46 M 29
S4 1 D R 21 F 36
S5 2 P NR 33 F 26
S6 2 P NR 78 F 22
S7 1 P NR 28 M 34
S8 2 P R 47 M 24
In total, there are hundreds of samples.
I aim to identify differentially expressed genes in DR vs DNR (Trt D with Status R vs. Trt D with Status NR).
However, I also need to control for covariates. Therefore, this is what I am doing:
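dds = DESeqDataSetFromMatrix(countData = counts, colData = meta, design = ~ Batch + Age + Sex + BMI + Status + BMI:Status)  # counts and meta are placeholder names for my count matrix and metadata table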
dds = DESeq(dds, test = "LRT", reduced = ~ Age + Sex + BMI + Batch)
This is then followed by:
results(dds, contrast = c("Status", "NR", "R"))
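To make the full vs. reduced comparison concrete, here is a minimal base-R sketch (not DESeq2 itself) that builds both design matrices from the eight rows shown above and lists the coefficients the LRT would evaluate. Treating Batch as a factor and using R as the reference level of Status are assumptions on my part, and the object names (meta, full, reduced, tested) are only for illustration.

```r
# Toy metadata copied from the eight rows shown above.
# Assumption: Batch is categorical, so it is encoded as a factor.
# Assumption: "R" is the reference level of Status.
meta <- data.frame(
  sample = paste0("S", 1:8),
  Batch  = factor(c(1, 2, 1, 1, 2, 2, 1, 2)),
  Trt    = factor(c("D", "D", "D", "D", "P", "P", "P", "P")),
  Status = factor(c("R", "NR", "R", "R", "NR", "NR", "NR", "R"),
                  levels = c("R", "NR")),
  Age    = c(33, 38, 46, 21, 33, 78, 28, 47),
  Sex    = factor(c("M", "F", "M", "F", "F", "F", "M", "M")),
  BMI    = c(25, 32, 29, 36, 26, 22, 34, 24)
)

# Design matrices for the full and reduced formulas used above.
full    <- model.matrix(~ Batch + Age + Sex + BMI + Status + BMI:Status,
                        data = meta)
reduced <- model.matrix(~ Age + Sex + BMI + Batch, data = meta)

# Coefficients present in the full design but absent from the reduced one.
# With test = "LRT", these are the terms that are tested jointly.
tested <- setdiff(colnames(full), colnames(reduced))
print(tested)
```

If I read this correctly, the two formulas differ by both the Status term and the BMI:Status term, so the LRT would be assessing them together rather than the Status effect alone.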
My question: is this the right way to go?
Note: with the BMI:Status interaction I get at least 200 genes at FC 2 and FDR 0.1. Without the interaction, this falls to 27 genes, which are pseudogenes or genes that make no sense.
Thank you for your time.