Question

DESeq2 (Different results when using interactive term equation)

LR0306 ▴ 10

@lr0306-13464

Hello all,

I am trying to use DESeq2 to determine the number of main effect DEGs and interaction DEGs in a two-factor experiment, where each factor has 2 levels.

###################################################

Basically, for my coldata object below,

> coldata
     disease vitamin treatment
NC.1       N       C        NC
NC.2       N       C        NC
NC.3       N       C        NC
NR.1       N       R        NR
NR.2       N       R        NR
NR.3       N       R        NR
VC.1       V       C        VC
VC.2       V       C        VC
VC.3       V       C        VC
VR.1       V       R        VR
VR.2       V       R        VR
VR.3       V       R        VR

I get the same results no matter which of the three design formulas below I use:

dds = DESeqDataSetFromMatrix(countData = data, colData = coldata, design = ~ disease*vitamin)

dds = DESeqDataSetFromMatrix(countData = data, colData = coldata, design = ~ disease + vitamin + disease*vitamin)

dds = DESeqDataSetFromMatrix(countData = data, colData = coldata, design = ~ disease + vitamin + disease:vitamin)

disease_V_vs_N has 0 DEGs

vitamin_R_vs_C has 941 DEGs

diseaseV.vitaminR has 0 DEGs
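The identical results are expected: in R formula syntax, `a*b` expands to `a + b + a:b`, and repeated terms are dropped, so all three formulas describe the same model. A minimal base-R sketch of this, assuming a sample table rebuilt by hand to mirror the coldata above (the object names `mm_star`, `mm_mixed`, and `mm_colon` are only illustrative):

```r
# Hypothetical sample table mirroring the coldata above (3 replicates per group).
coldata <- data.frame(
  disease = factor(rep(c("N", "V"), each = 6), levels = c("N", "V")),
  vitamin = factor(rep(rep(c("C", "R"), each = 3), times = 2), levels = c("C", "R"))
)
rownames(coldata) <- paste0(coldata$disease, coldata$vitamin, ".", rep(1:3, times = 4))

# Build the design matrix for each of the three formulas.
mm_star  <- model.matrix(~ disease * vitamin, coldata)
mm_mixed <- model.matrix(~ disease + vitamin + disease * vitamin, coldata)
mm_colon <- model.matrix(~ disease + vitamin + disease:vitamin, coldata)

# All three should have the same columns and the same entries.
isTRUE(all.equal(mm_star, mm_mixed))
isTRUE(all.equal(mm_star, mm_colon))
colnames(mm_star)
```

Because the design matrices match, any model fitted from them has the same coefficients, whichever spelling of the formula is used.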

###################################################

However, for that same coldata object, if I do the following code:

dds = DESeqDataSetFromMatrix(countData = data, colData = coldata, design = ~ disease + vitamin)

I get different values for the effects (and I get no interactive effects due to the model used):

disease_V_vs_N has 34 DEGs

vitamin_R_vs_C has 1919 DEGs

###################################################

Finally, if I rename the rows of the coldata object as follows:

> coldata
disease vitamin treatment
N.1 N N
N.2 N N
N.3 N N
N.4 N N
N.5 N N
N.6 N N
V.1 V V
V.2 V V
V.3 V V
V.4 V V
V.5 V V
V.6 V V

And run the following code:

dds = DESeqDataSetFromMatrix(countData = data, colData = coldata, design = ~ treatment)

I now get the following:

disease_V_vs_N has 43 DEGs

###################################################

I am hoping to obtain some DEGs to look at for the disease_V_vs_N comparison, so I like the last method. However, I also need to obtain the other main effect and the interaction term, which I can only do with the first method.

My questions are:

1) Is the inconsistency across the methods surprising? For example: disease_V_vs_N comparison has 0 DEGs, 34 DEGs, and 43 DEGs across the methods. Similarly, vitamin_R_vs_C has 941 DEGs and 1919 DEGs across the methods.

2) Is there another method to test for interaction that might reveal a different number of DEGs than the count of 0 I obtain in this case?

3) How (in)appropriate would it be for me to use the last method for my disease_V_vs_N comparison (mostly because I am hoping to have at least a handful of DEGs) and do the same procedure for the vitamin_R_vs_C comparison, but use the first method for my interactive term?

Thank you for sharing your thoughts on this matter.

updated 6.6 years ago by Michael Love 43k • written 6.6 years ago by LR0306 ▴ 10

score 0 · Answer 1 · 2018-04-10

Q1: These are totally different designs with different interpretations. Take a look at the section of the vignette on interactions. If the difference between the designs is still unclear after reading that material, I strongly recommend meeting with a statistician to discuss the proper design for your experiment.
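To make the difference in interpretation concrete, here is a rough sketch using ordinary least squares on noise-free toy values for a single hypothetical gene. This is an assumption-laden illustration, not DESeq2's negative binomial GLM, and the cell means are invented; the point is only that the coefficient labelled `diseaseV` answers a different question in each design.

```r
# Invented, noise-free values for one hypothetical gene (think log scale).
cell_means <- c(NC = 10, NR = 14, VC = 11, VR = 17)

toy <- data.frame(
  disease = factor(rep(c("N", "V"), each = 6), levels = c("N", "V")),
  vitamin = factor(rep(rep(c("C", "R"), each = 3), times = 2), levels = c("C", "R"))
)
# Look up each sample's value from its disease/vitamin combination.
toy$y <- unname(cell_means[paste0(toy$disease, toy$vitamin)])

# Interaction design: diseaseV is the V vs N difference within vitamin C only.
coef_interaction <- coef(lm(y ~ disease * vitamin, data = toy))

# Additive design: diseaseV is the V vs N difference averaged over vitamin levels.
coef_additive <- coef(lm(y ~ disease + vitamin, data = toy))

round(coef_interaction, 6)  # diseaseV = 1 (VC - NC), interaction = 2
round(coef_additive, 6)     # diseaseV = 2 (mean of V minus mean of N)
```

In the interaction design the `diseaseV` coefficient is the disease effect in the reference vitamin level alone, while in the additive design it is a single disease effect shared across both vitamin levels. A design with only one grouping factor and no vitamin term also leaves the vitamin-driven variation unexplained, which can change the variance estimates behind each test. Differences in the number of significant genes between the designs are therefore not surprising.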

Q2: No

Q3: You should pick a design based on what you want to test (that is, on the proper meaning of the coefficients and on controlling for the right variables), and definitely not based on how many genes reject the null.
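As a hedged sketch of what this can look like in practice, a single interaction design can supply each of the comparisons discussed in this thread, so there is no need to mix designs. The example below assumes simulated counts from `makeExampleDESeqDataSet` as a stand-in for the real data, with `disease` and `vitamin` columns added by hand; the simulated counts carry no real disease or vitamin signal, so only the structure of the calls is meaningful.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real count matrix: 1000 genes, 12 samples.
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Attach the two factors by hand, with N and C as reference levels.
dds$disease <- factor(rep(c("N", "V"), each = 6), levels = c("N", "V"))
dds$vitamin <- factor(rep(rep(c("C", "R"), each = 3), times = 2), levels = c("C", "R"))

design(dds) <- ~ disease * vitamin
dds <- DESeq(dds)
resultsNames(dds)

# Disease effect (V vs N) within the reference vitamin level C.
res_disease_in_C <- results(dds, name = "disease_V_vs_N")

# Disease effect (V vs N) within vitamin level R: main effect plus interaction.
res_disease_in_R <- results(dds,
  contrast = list(c("disease_V_vs_N", "diseaseV.vitaminR")))

# Interaction: does the disease effect differ between vitamin levels?
res_interaction <- results(dds, name = "diseaseV.vitaminR")
```

Which of these tables is relevant depends on the biological question, and that choice is best made before looking at how many genes pass the significance threshold.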