Obtain p-values for group * condition interaction using DESeq2?
1
2
Entering edit mode
@charlesfoster-17652
Last seen 4 months ago
Australia

Hi all,

Despite reading the DESeq2 vignette (and other useful references), I'm still a little uncertain about how to best test for differential expression in the presence of a group by condition interaction. I have two conditions (for simplicity here cond1 and cond2), and eight species (for simplicity here species1, species2, ..., species8). What I hope to do is look for genes that are differentially expressed genes between the two conditions in all species, avoiding genes where there is a species*condition interaction.

To explain the confusion, I'll start by talking about what I'd do when running GLMs with interactions using the glm.nb function (from MASS). In this situation, for each gene I would (i) run glm.nb; (ii) run anova on the results. The design for glm.nb with an interaction is Count ~ Species + Condition + Species:Condition.

Therefore, for each gene I would have results like this:

Any gene for which the Species:Condition interaction is significant (like in the screenshot above) cannot be included in my results, so I remove them from the results. I'm left with a results table for genes where there is a significant difference between cond1 and cond2, and where there is no species*condition interaction.

In DESeq2, I believe that if I had just two species, I could run an analysis including an interaction term like so:

dds <- DESeqDataSetFromMatrix(countData = counts, colData = samples, design = ~Species+Condition+Species:Condition)

dds <- DESeq(dds)

results_interaction <- results(dds, name="Speciesspecies2.Conditioncond2")

Note: reference level of "Species" is species1; reference level of "Condition" is cond1.

Question 1: Am I interpreting the output correctly that any gene with a significant pvalue in the results_interaction object I create above are those genes for which there is a significant interaction between Species and Condition? If so, are these genes analagous to those that have a significant interaction using functions like glm.nb, as in the screenshot example above?

Question 2: To get the genes differentially expressed between cond1 and cond2 in species1 (but without a Species:Condition interaction), do I do the following?

res1 <- results(dds, name="Condition_cond2_vs_cond1")

Question 3: Is there a way to directly test for genes that are differentially expressed in all species (i.e., don't have a Species:Condition interaction)? If not, do I have to do something like:

res1 <- results(dds, name="Condition_cond2_vs_cond1") #testing for DE without Species*Group interaction

res1 <- as.data.frame(res1)

res1 %>% filter(padj < 0.05) %>% arrange(padj) -> res1_DE

res1_DE <- rownames(res1_DE)

res2 <- results(dds, list(c("Condition_cond2_vs_cond1", "Speciesspecies2.Conditioncond2"))) #testing for DE without Species*Group interaction

res2 <- as.data.frame(res2)

res2 %>% filter(padj < 0.05) %>% arrange(padj) -> res2_DE

res2_DE <- rownames(res2_DE)

both_DE <- union(res1_DE, res2_DE)

Question 4: How do the answers to these questions scale up to situations where I am looking for a difference between two conditions in >2 species? For example, for one component of my actual study I need to look for genes that are differentially expressed between cond1 and cond2 in all eight species (or in a certain 4/8 species etc.).

In my actual study, the possible resultsNames look like so:

resultsNames(dds)

[1] "Intercept"
[2] "Speciesspecies2vsspecies1"
[3] "Species
species3vsspecies1"
[4] "Speciesspecies4vsspecies1"
[5] "Species
species5vsspecies1"
[6] "Speciesspecies6vsspecies1"
[7] "Species
species7vsspecies1"
[8] "Speciesspecies8vsspecies1"
[9] "Condition
cond2vscond1"
[10] "Speciesspecies2.Conditioncond2" [11] "Speciesspecies3.Conditioncond2" [12] "Speciesspecies4.Conditioncond2" [13] "Speciesspecies5.Conditioncond2" [14] "Speciesspecies6.Conditioncond2" [15] "Speciesspecies7.Conditioncond2" [16] "Speciesspecies8.Conditioncond2"

Thanks to anyone who can help with these questions!

DESeq2 interaction differential expression • 214 views
1
Entering edit mode
@mikelove
Last seen 2 hours ago
United States

Q1 "Am I interpreting the output correctly that any gene with a significant pvalue in the results_interaction object I create above are those genes for which there is a significant interaction between Species and Condition?"

Yes.

Q2: "To get the genes differentially expressed between cond1 and cond2 in species1 (but without a Species:Condition interaction), do I do the following?"

Genes with a significant main effect can also have a significant interaction. Gene 2 in the diagrams in the vignette is an example of this.

Q3: "Is there a way to directly test for genes that are differentially expressed in all species (i.e., don't have a Species:Condition interaction)? If not, do I have to do something like:"

If you want the interaction terms to be small, you need to define what is small. Then you could use lfcThrehsold with altHypothesis="lessAbs". And you could find the intersection of genes with small interaction terms.

Q4: "How do the answers to these questions scale up to situations where I am looking for a difference between two conditions in >2 species?"

The same as in base R: one species acts as the reference level and the others are compared to it. The interaction terms represent differences in the effect compared to the effect seen in the reference level. Actually this is exactly what is diagramed in the vignette in the Interactions section.