Question: NOOB question: How to use contrasts for subgroup analysis in DESeq2?
gravatar for ariel
9 months ago by
ariel0 wrote:

I'm working on a case/control study and including some other variables such as genotypes. One thing I want to do is a subgroup analysis of the genotypes for just the case group. The brute force method is to subset my data to the case group and run deseq on that. I'm very new to deseq and the very idea of contrasts. But from reading it seems like contrasts offer a more elegant way to do subgroup analysis without having to rerun deseq multiple times on differently subsetted data.

Suppose I have the following variables:

  • Case: [0,1]
  • Gene1: [YY, YN, NN]
  • Gene2: [YY, YN, NN]

I run deseq with the design "~ Case + Gene1 + Gene2 + Gene1:Gene2"

How would I write contrasts that would be the equivalent of

  1. Subgroup just those with the disease (Case=1)
  2. Subgroup just homozygous genotypes: (Gene1 = [YY, NN], Gene2=[YY,NN])
  3. Ask the question: Is the strongest effect due to Gene1, Gene2, or the interaction?


Here is a putative study design to work with:

> fake_study_design = data.frame(
+     Case=sample(c('Case', 'Control'), 10, replace=T), 
+     GeneA=sample(c('YY','YN','NN'), 10, replace=T), 
+     GeneB=sample(c('YY', 'YN', 'NN'), 10, replace=T)
+ )
> fake_study_design
      Case GeneA GeneB
1     Case    YY    YY
2     Case    YN    NN
3  Control    YY    NN
4  Control    YY    YN
5  Control    YN    YY
6  Control    NN    NN
7  Control    NN    YY
8     Case    NN    YN
9  Control    YY    NN
10 Control    YY    YY
deseq deseq2 contrast subgroup • 238 views
ADD COMMENTlink modified 9 months ago • written 9 months ago by ariel0
Answer: NOOB question: How to use contrasts for subgroup analysis in DESeq2?
gravatar for Michael Love
9 months ago by
Michael Love26k
United States
Michael Love26k wrote:

I can't follow, can you show the sample table, or an example of it? E.g.:

sample case gene1 gene2 sample1 0 YY YY ...

ADD COMMENTlink written 9 months ago by Michael Love26k

Added to post. Is this what you wanted?

ADD REPLYlink written 9 months ago by ariel0

Yes, so now I have some sense of the design (I'm not sure if you have all the required samples though).

You can combine the variables if you want to make comparisons just within one group, see here:

ADD REPLYlink written 8 months ago by Michael Love26k

The problem is, I'm so clueless about contrasts, I don't know how to take the vignette example and put it in practice. For instance, if my design was ~ Case + GeneA + GeneB + GeneA:GeneB, then to subgroup just Case=="Case", my contrast would be contrast=c(?,?,?).

ADD REPLYlink written 8 months ago by ariel0

I’d recommend chatting with a statistician or bioinformatician who works with R designs. They can help guide you through the missing pieces.

ADD REPLYlink written 8 months ago by Michael Love26k

Ok. Defying good community practice, I've cross-posted here to see if a stats person can help.

ADD REPLYlink written 8 months ago by ariel0

Hi @Michael--I thought of another way to phrase the question that might be more direct:

If I have created a deseq object with design ~CaseString + Gender, can I still access results for a subset of the data? For instance, can I create a contrast that would be equivalent to first subset data for Case=="Case", then run deseq?

ADD REPLYlink modified 8 months ago • written 8 months ago by ariel0

See the section on interactions. That is what interactions do, and we have a diagram to explain how to interpret the coefficients in the vignette. But more generally, you want to make sure you get it right, so if it remains unclear after reading the documentation, I strongly recommend to find a local statistician who you can sit down with and have them explain the interpretation of the coefficients in your linear model of interest. There isn't anything specific about DESeq2 here, we use the same coefficients as any linear model in R or otherwise.

ADD REPLYlink written 8 months ago by Michael Love26k

Thanks, Micheal! Your reply speed is absolutely stunning. I have made arrangements with a statistician :) I did read the section on interactions, but I wasn't sure if I really have an interaction here. I'm far from an expert in R formulas, but my understanding is that something like Case:Gender would mean that Case and Gender are acting together, as opposed to looking at Gender controlled for Case but only for Case="Case". Maybe those are the same thing. Statistician should help sort this out.

ADD REPLYlink written 8 months ago by ariel0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 298 users visited in the last hour