DESeq2 logfoldchange and interaction in model
1
0
Entering edit mode
@pierremarin-17659
Last seen 18 months ago

Hello ! PhD student and working with rna-seq data, I used DESeq2 for my analysis but after finishing my work, I believe some elements were misunderstood when using the software. To explain I summarized my model analysis: 3 genotypes (G1,G2,G3) in 3 conditions (T1,T2,T3) and my model: I'm interesting by the effect of genotype and treatment but especially by the interaction term in my work. So the design appropriated seem to be:

design = formula(~G + B + G:B)


Reference factors are : G1 and T1

intercept  : base mean of G1 in T1 (reference)
G2vsG1   : Genotype 2 against reference G1 in T1
G3vsG1   : Genotype 3 against reference G1 in T1
T2vsT1   : treatment 2 against treatment 1 in T1
T3vsT1   : treatment 3 against treatment 1 in T1
G2:T2     : treatment 2 effect in G2
G3:T2     : treatment 3 effect in G2
G2:T3     : treatment 2 effect in G3
G3:T3     : treatment 3 effect in G3


The model built with every parameter compared to the references. following the DESeq2 vignette I built the contrast test for example :

effect of the T2 compare to the T1 for G1 (treatment 2 effect on the genotype 1) :

results(deseq.object,contrast=list(c("T2vsT1")))


effect of the T2 for G2 compared to G1 (treatment 2 effect on the genotype 2):

results(deseq.object,contrast=list(c("T2vsT1","G2:T2")))


effect of the T2 for G2 against G3 (treatment 2 effect and genotype 2 vs genotype 3 effect = genotype environment interaction):

results(deseq.object,contrast=list(c("G2:T2","G3:T2")))


If I understood, for the last example, I tested the GxE effect in the treatment 2 corrected by the treatment 1 (the reference) and compared between 2 genotypes (typically as in a linear model in R).

After that, I have some contrast files, especially with log2foldchange but I didn't understand the underlying comparison. If I want to test the effect in a genotype of the treatment 2 against 1 it's the ratio between the two treatment (with the fact that there is some modifications by deseq with shrink etc. not simply the ratio but the idea is here). But the logfoldchange for my last example when I test GxE is not clear and Deseq2 help and forum did'nt gave me an information about. Is it a ration between genotype in the treatment 2 after correction for the treatment 1 ? a ratio of ratio ?

deseq2 R logfoldchange rna-seq • 1.9k views
0
Entering edit mode

Hello ,

Thank you very much because I understood first I made a mistake because in my model I have interaction and I didn't use the shrinkage function with lfcShrink which seem to be necessary in my case to avoid expression level mistake.

0
Entering edit mode

Hello,

Just thank you I discussed with a statistician collegue and I understand you answers !

So log2foldchange in interaction term for B against A in env 2 is:

Log2( (popB.env2/popB.env1) / (popA.env2/popA.env1) )


Thank's !

Juste a question about coding interaction, in the results vignette

# the interaction term for condition effect in genotype III vs genotype I.
# this tests if the condition effect is different in III compared to I
results(dds, name="genotypeIII.conditionB")


It's exactely the same if I wrote with numeric vector with only +1 for genotypeIII.conditionB

results(dds, contrast=c(0,0,0,0,0,+1))


But you also said:

# the interaction term for condition effect in genotype III vs genotype II.
# this tests if the condition effect is different in III compared to II
results(dds, contrast=list("genotypeIII.conditionB", "genotypeII.conditionB"))


If you code that, you add the first parameter with the second like that:

results(dds, contrast=c(0,0,0,0,+1,+1)


But if I want the difference between III and II I believed to write:

results(dds, contrast=c(0,0,0,0,+1,-1)


I'm not sure of the last contrast test.

1
Entering edit mode

When you provide a list to contrast, the first element of the list gets the +1 and the second element of the list gets a -1. You can have multiple elements in the first element and/or multiple in the second element, e.g. list( c(...), c(...) ). Then the coefficients you name in the first c(...) will get +1 and likewise the coefficients in the second c(...) get -1.

0
Entering edit mode

Hi,

I have been struggling for some time with the same problem, making a contrast between effects in two conditions with a code like the above, "results(dds, contrast=list("genotypeIII.conditionB", "genotypeII.conditionB")). Unfortunately, I'm not sure I understand the answer.

I would expect the code to test a subtraction of the two interactions, that is similar to results(dds, contrast=c(0,0,0,0,+1,-1). However, the executed code is an addition, identical to results(dds, contrast=c(0,0,0,0,+1,+1).

Is this really correct?, and in that case is there a way to specify contrast=c(0,0,0,0,+1,-1) using names?

0
Entering edit mode

That's not correct. It performs an addition if you do list( c("x", "y") ), but not if you do list("x", "y").

0
Entering edit mode

Thank you very much for the quick reply.

You are absolutely right, I had overlooked the important difference between list( c("x", "y") ) and list("x", "y").

It all makes sense now.

2
Entering edit mode
swbarnes2 ★ 1.1k
@swbarnes2-14086
Last seen 3 hours ago
San Diego

For interaction tests, the column is labeled "LogFoldChange", but it really is a ratio of fold changes. You can eyeball this yourself by using the normalized counts, and working out the average values of the different treatments, and comparing their ratios. The number will not quite match what DESeq outputs, because DESeq is taking into account more things, but it is likely to be close.

0
Entering edit mode

If I understand you, Example with the geneX: Table of normalized mean count for the geneX. G is genotype and T is treatment

 G  T         mean
G1 T1    0.4578779
G2 T1  203.3252143
G3 T1 1450.3183031
G1 T2  143.0287862
G2 T2    0.0000000
G3 T2 3689.9999219
G1 T3  495.4010815
G2 T3    2.6363624
G3 T3 3625.5900077


In my contrast test for the difference between G2 and G3 in T2 I wrote :

results(deseq.object,contrast=list(c("G2:T2","G3:T2")))


The output is :

           baseMean log2FoldChange        lfcSE         stat          pvalue            padj
geneX    1067.86195    -24.5313257    3.6579721    -6.706264    1.996708e-11    1.112987e-07


This gene is significantly modified during the treatment T2 between G1 and G2 and the log2foldchange = -24.5

using normalized count I tried it:

log2(G2_T2/G3_T2)= inf


But how to deal with 0, I imagine DESeq2 make an approximation of 0? And how to deal with the T1 which represent control condition for example?

1
Entering edit mode

Here the fold change is not finite and DESeq2 for infinite coefficients will stop once it passes a very large coefficient value. Note that we’ve spent a lot of time working on incorporating shrinkage models to give posterior inference. You could directly use ashr here, or apeglm but with a little more work (see vignette extended section on shrinkage).

0
Entering edit mode

Hello,

Thank you for the response, I understand better how DESeq2 work. But it remain unclear for me what is taking acount for the folchange calcul when I make a contrast between interactive terms as in my last example.

1
Entering edit mode

It’s the ratio of fold changes as swbarnes said above. For further understanding of how interactions work, and contrasts of interactions, beyond the documentation we have in ?results and in the vignette, and if this is your first time analyzing a linear mode with interaction terms, I’d recommend to meet with a local statistician who can explain these.