**0**wrote:

Hi all,

I am new to R scripting and bioinformatics in general. I'm using edgeR to run DE statistics on a project with 2 factors: diet (consisting of 4 diets: 12, 24, 30 ,39) and stress (consisting of 5 stressors: C, L, H, B, LB). I have 4 replicates per treatment (80 samples in total). I would like to figure out which genes are DE across the main effects of diet, stress, and across a diet*stress interaction. I'm not sure how to run a full model at once that would give me the genes for the main effects and interactions (as is done in an ANOVA), so I've been looking at each separately by creating model matrices that just specify diet, just stress, an additive diet+stress model, and then a full model with all combinations. I noticed when I run a glmFit on diet, stress, diet+stress, and the interaction the number of DE genes changes depending on what model matrix I reference (see below). There are few differences between the diet and stress separate models and the diet+stress model, but large difference between those and the full model.

2 questions: Does anyone know why the number of DE genes changes depending on the design matrix?

Can anyone tell me the appropriate way to determine the main diet and stress effects and the interaction in one model (I know how to do individual contrasts but I'm looking for main and interaction effects first)?

Thanks in advance!

carrie

ADDITIVE MODEL:

>design.add<-model.matrix(~diet+stress)

> glm.stressadd<-glmFit(Hz_dgelist_filtered, design.add)

> design.add

> lrt.stressadd<-glmLRT(glm.stressadd, coef=5:8)

> topTags(lrt.stressadd)

> FDR.stressadd<-p.adjust(lrt.stressadd$table$PValue, method="BH")

> sum(FDR.stressadd<0.05)

[1] **5053**

> glm.dietadd<-glmFit(Hz_dgelist_filtered, design.add)

> lrt.dietadd<-glmLRT(glm.dietadd, coef=2:4)

> topTags(lrt.dietadd)

> FDR.dietadd<-p.adjust(lrt.dietadd$table$PValue, method="BH")

> sum(FDR.dietadd<0.05)

[1] **2913**

DIET MODEL:

> glm.diet<-glmFit(Hz_dgelist_filtered, design.diet)

> lrt.diet<-glmLRT(glm.diet, coef=2:4)

> topTags(lrt.diet)

> FDR.diet<-p.adjust(lrt.diet$table$PValue, method="BH")

> sum(FDR.diet<0.05)

[1] **3225**

STRESS MODEL:

>design.stress<-model.matrix(~stress)

> glm.stress<-glmFit(Hz_dgelist_filtered, design.stress)

> lrt.stress<-glmLRT(glm.stress, coef=2:5)

> topTags(lrt.stress)

> FDR.stress<-p.adjust(lrt.stress$table$PValue, method="BH")

> sum(FDR.stress<0.05)

[1] **5371**

DIET/FULL MODEL:

> design.full<-model.matrix(~0+diet+stress+diet:stress)

> glm.dietfull<-glmFit(Hz_dgelist_filtered, design.full)

> lrt.dietfull<-glmLRT(glm.dietfull, coef=2:4)

> topTags(lrt.dietfull)

> FDR.dietfull<-p.adjust(lrt.dietfull$table$PValue, method="BH")

> sum(FDR.dietfull<0.05)

[1]** 10719**

STRESS/FULL MODEL:

> glm.stressfull<-glmFit(Hz_dgelist_filtered, design.full)

> lrt.stressfull<-glmLRT(glm.stressfull, coef=5:8)

> topTags(lrt.stressfull)

> FDR.stressfull<-p.adjust(lrt.stressfull$table$PValue, method="BH")

> sum(FDR.stressfull<0.05)

[1] **1043**

> glm.intfull<-glmFit(Hz_dgelist_filtered, design.full)

> lrt.intfull<-glmLRT(glm.intfull, coef=9:20)

> topTags(lrt.intfull)

> FDR.intfull<-p.adjust(lrt.intfull$table$PValue, method="BH")

> sum(FDR.intfull<0.05)

[1] **587**