NOOB question: How to use contrasts for subgroup analysis in DESeq2?
1
0
Entering edit mode
ariel ▴ 20
@ariel-16886
Last seen 2.9 years ago
United States

I'm working on a case/control study and including some other variables such as genotypes. One thing I want to do is a subgroup analysis of the genotypes for just the case group. The brute force method is to subset my data to the case group and run deseq on that. I'm very new to deseq and the very idea of contrasts. But from reading it seems like contrasts offer a more elegant way to do subgroup analysis without having to rerun deseq multiple times on differently subsetted data.

Suppose I have the following variables:

  • Case: [0,1]
  • Gene1: [YY, YN, NN]
  • Gene2: [YY, YN, NN]

I run deseq with the design "~ Case + Gene1 + Gene2 + Gene1:Gene2"

How would I write contrasts that would be the equivalent of

  1. Subgroup just those with the disease (Case=1)
  2. Subgroup just homozygous genotypes: (Gene1 = [YY, NN], Gene2=[YY,NN])
  3. Ask the question: Is the strongest effect due to Gene1, Gene2, or the interaction?

Thanks!

Here is a putative study design to work with:

> fake_study_design = data.frame(
+     Case=sample(c('Case', 'Control'), 10, replace=T), 
+     GeneA=sample(c('YY','YN','NN'), 10, replace=T), 
+     GeneB=sample(c('YY', 'YN', 'NN'), 10, replace=T)
+ )
> fake_study_design
      Case GeneA GeneB
1     Case    YY    YY
2     Case    YN    NN
3  Control    YY    NN
4  Control    YY    YN
5  Control    YN    YY
6  Control    NN    NN
7  Control    NN    YY
8     Case    NN    YN
9  Control    YY    NN
10 Control    YY    YY
deseq deseq2 contrast subgroup • 2.3k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 16 hours ago
United States

I can't follow, can you show the sample table, or an example of it? E.g.:

sample case gene1 gene2 sample1 0 YY YY ...

ADD COMMENT
0
Entering edit mode

Added to post. Is this what you wanted?

ADD REPLY
0
Entering edit mode

Yes, so now I have some sense of the design (I'm not sure if you have all the required samples though).

You can combine the variables if you want to make comparisons just within one group, see here:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions

ADD REPLY
0
Entering edit mode

The problem is, I'm so clueless about contrasts, I don't know how to take the vignette example and put it in practice. For instance, if my design was ~ Case + GeneA + GeneB + GeneA:GeneB, then to subgroup just Case=="Case", my contrast would be contrast=c(?,?,?).

ADD REPLY
0
Entering edit mode

I’d recommend chatting with a statistician or bioinformatician who works with R designs. They can help guide you through the missing pieces.

ADD REPLY
0
Entering edit mode

Ok. Defying good community practice, I've cross-posted here to see if a stats person can help.

ADD REPLY
0
Entering edit mode

Hi @Michael--I thought of another way to phrase the question that might be more direct:

If I have created a deseq object with design ~CaseString + Gender, can I still access results for a subset of the data? For instance, can I create a contrast that would be equivalent to first subset data for Case=="Case", then run deseq?

ADD REPLY
0
Entering edit mode

See the section on interactions. That is what interactions do, and we have a diagram to explain how to interpret the coefficients in the vignette. But more generally, you want to make sure you get it right, so if it remains unclear after reading the documentation, I strongly recommend to find a local statistician who you can sit down with and have them explain the interpretation of the coefficients in your linear model of interest. There isn't anything specific about DESeq2 here, we use the same coefficients as any linear model in R or otherwise.

ADD REPLY
0
Entering edit mode

Thanks, Micheal! Your reply speed is absolutely stunning. I have made arrangements with a statistician :) I did read the section on interactions, but I wasn't sure if I really have an interaction here. I'm far from an expert in R formulas, but my understanding is that something like Case:Gender would mean that Case and Gender are acting together, as opposed to looking at Gender controlled for Case but only for Case="Case". Maybe those are the same thing. Statistician should help sort this out.

ADD REPLY

Login before adding your answer.

Traffic: 589 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6