Question

DESeq results names when using interaction terms

1

Entering edit mode

akaever ▴ 30

@akaever-7380

Last seen 9.1 years ago

Hi,

I am trying to extract results from a DESeqDataSet object based on a customized contrast list. Depending on whether the design contains an interaction term, the effect/result names change, which makes it quite difficult to extract results in an automatic fashion (e.g. as part of a script or app, without checking resultsNames() manually). Here's an example based on ?results:

library(DESeq2)

dds <- makeExampleDESeqDataSet(n=100,m=24)
dds$genotype <- factor(rep(rep(c("I","II","III"),each=4),2))

design(dds) <- ~ genotype + condition
dds <- DESeq(dds)
resultsNames(dds)
# [1] "Intercept"   "genotypeI"   "genotypeII"  "genotypeIII" "conditionA" "conditionB"

# genotypeI compared to average of genotypeII and genotypeIII
results(dds, contrast=list("genotypeI", c("genotypeII", "genotypeIII")),
        listValues=c(1, -0.5))

# works!

However, when adding an interaction term, pairwise effects (_vs_) instead of individual effects are extracted, which makes it difficult to extract results for individual levels or customized contrasts:

design(dds) <- ~ genotype + condition + genotype:condition
dds <- DESeq(dds)
resultsNames(dds)
# [1] "Intercept" "genotype_II_vs_I" "genotype_III_vs_I" "condition_B_vs_A" "genotypeII.conditionB" "genotypeIII.conditionB"

results(dds, contrast=list("genotypeI", c("genotypeII", "genotypeIII")),
        listValues=c(1, -0.5))

# error

Additionally, there seems to be no way to extract results for e.g. genotypeI.conditionA vs ...

DESeq(dds, modelMatrixType="expanded") requires a beta prior, but betaPrior=TRUE results in: Error in designAndArgChecker(object, betaPrior) : betaPrior=FALSE should be used for designs with interactions

Is there a way to directly influence which effects are calculated (e.g. genotypeI and genotypeII instead of genotype_II_vs_I, or genotypeII.conditionA instead of genotypeII.conditionB)? (For cases where resultsNames(dds) are not checked manually)

Thanks.

DESeq deseq2 resultsnames • 5.0k views

ADD COMMENT • link 9.6 years ago akaever ▴ 30

score 1 · Answer 1 · 2016-06-28

Hi,

Since version 1.10 (we are now on version 1.12), we changed the way that designs with interactions are modeled in DESeq2, as you have noticed. (This change and similar ones are in the NEWS file of the package, if you ever want to see what significant changes are going on.)

There's a good reason for this change, because our users were often mis-interpreting the meaning of the terms in a design with interaction when we used the shrinkage of LFC (betaPrior=TRUE).

So for these designs, we insist that the standard terms produced by R's model.matrix function be used.

If you want consistency between designs with and without interactions terms, you could always use betaPrior=FALSE. This is up to you. We find that it is useful for visualization and interpretation of fold changes, but if you care mostly about adjusted p-values, these are valid either way (they may change slightly but often not much at all).

So, then, the next question is, how to properly interpret the standard terms produced by R's model.matrix for a design with interactions? As you are using the same names and levels of factors, it looks like you found the section in the vignette which diagrams how to interpret interaction terms.

If you want to compare an average of genotype II and III vs genotype I for condition A you would do:

results(dds, contrast=list(c("genotype_III_vs_I","genotype_II_vs_I")),
  listValues=c(.5,1))

This is just the average of the III vs I and the II vs I effect, which is comparable to what you did the first time, except because you have specified a design with interactions, you are requesting to have condition-specific genotype comparisons.

For condition B, the comparison would be:

results(dds, contrast=list(c("genotype_III_vs_I","genotype_II_vs_I",
  "genotypeIII.conditionB","genotypeII.conditionB")),
  listValues=c(.5,1))

score 0 · Answer 2 · 2016-07-18

0

Entering edit mode

akaever ▴ 30

@akaever-7380

Last seen 9.1 years ago

Hi,

What would be the easiest way to extract the expected resultsNames before calling DESeq(dds) (which may take considerable time)?

There seems to be some name conversions between colnames(model.matrix(design(dds), colData(dds))) and resultsNames(dds).

ADD COMMENT • link 9.6 years ago akaever ▴ 30

0

Entering edit mode

I don't at this time have a convenience function to show the results names but you can run it on a few rows e.g. 10, if you want to see in advance. For coefficients that are built as comparisons to reference level (as in designs without the beta prior) we add that reference level in the result name for clarity to end users, for example condition_B_vs_A.

Actually, the rules are pretty simple:

For designs with betaPrior, you get the same resultsNames as you would get with ~0 + condition, e.g. conditionB, except that in the designs with betaPrior we also have an intercept. Likewise, obviously for the designs which turn off beta prior but have ~0 + condition you get the standard terms you would get with model.matrix, e.g conditionB.

For the designs without beta prior, but which have an intercept term, you get the condition_B_vs_A style of resultsNames, to emphasize that the coefficient is built as a comparison against the reference level. Interaction terms force the beta prior off, so you get these style of terms for the main effects. The interaction terms use the default style from model.matrix, e.g. conditionB.genotypeII.