DESeq2 design formula with multiple factors
Helena (@helena-23852)

Hi all,

I have three RNA-Seq datasets that I want to analyze for differential expression with DESeq2. My main goals are to:

  1. Compare treatment to control and see whether there are differences between the sexes
  2. Compare treatment to control while controlling for sex and batch effects
  3. Compare treatment to control and also test the difference between genotypes, controlling for sex and batch effects

The corresponding datasets, from simplest to most complex, are:

First,

subject    sex    condition
      1      M    treatment
      1      M      control
      2      M    treatment
      2      M      control
      3      M    treatment
      3      M      control
      4      F    treatment
      4      F      control

Second (subjects 1-4 are the same as in the first dataset),

subject    sex    condition    batch
      1      M    treatment        C
      1      M      control        C
      2      M    treatment        C
      2      M      control        C
      3      M    treatment        C
      3      M      control        C
      4      F    treatment        C
      4      F      control        C
      5      M    treatment        B
      5      M      control        B
      6      M    treatment        B
      6      M      control        B
      7      F    treatment        A
      7      F      control        A
      8      F    treatment        A
      8      F      control        A

Third (subjects 1-8 are the same as in the second dataset),

subject    sex    condition    batch    genotype
      1      M    treatment        C           X
      1      M      control        C           X
      2      M    treatment        C           X
      2      M      control        C           X
      3      M    treatment        C           X
      3      M      control        C           X
      4      F    treatment        C           X
      4      F      control        C           X
      5      M    treatment        B           X
      5      M      control        B           X
      6      M    treatment        B           X
      6      M      control        B           X
      7      F    treatment        A           X
      7      F      control        A           X
      8      F    treatment        A           X
      8      F      control        A           X
      9      M    treatment        B           Y
      9      M      control        B           Y
     10      M    treatment        B           Y
     10      M      control        B           Y
     11      M    treatment        B           Y
     11      M      control        B           Y

At first I did not use a paired-sample design, and all of the following designs work:

  1. design = ~ sex + condition
  2. design = ~ sex + condition + batch
  3. design = ~ sex + condition + batch + genotype + genotype:condition
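
One way to check up front whether a design can be fitted is to build the model matrix in base R and compare its rank with its number of columns. The following is a minimal sketch, assuming the third sample table is typed in by hand as a data frame and the first two datasets are taken as its first 8 and 16 rows. The names `coldata`, `d1`, `d2` and the helper `is_full_rank` are made up for illustration and are not part of DESeq2.

```r
# One row per subject, copied from the third table above.
# All object names here are illustrative, not DESeq2 objects.
per_subject <- data.frame(
  subject  = factor(1:11),
  sex      = factor(c("M", "M", "M", "F", "M", "M", "F", "F", "M", "M", "M")),
  batch    = factor(c("C", "C", "C", "C", "B", "B", "A", "A", "B", "B", "B")),
  genotype = factor(c("X", "X", "X", "X", "X", "X", "X", "X", "Y", "Y", "Y"))
)

# Two samples (treatment, control) per subject; control is the reference level.
coldata <- per_subject[rep(1:11, each = 2), ]
coldata$condition <- factor(rep(c("treatment", "control"), times = 11),
                            levels = c("control", "treatment"))

# First and second datasets are subjects 1-4 and 1-8 of the third one.
d1 <- droplevels(coldata[1:8, ])
d2 <- droplevels(coldata[1:16, ])

# Hypothetical helper: TRUE when the model matrix has full column rank.
is_full_rank <- function(formula, data) {
  mm <- model.matrix(formula, data = data)
  qr(mm)$rank == ncol(mm)
}

unpaired_ok <- c(
  design1 = is_full_rank(~ sex + condition, d1),
  design2 = is_full_rank(~ sex + condition + batch, d2),
  design3 = is_full_rank(~ sex + condition + batch + genotype +
                           genotype:condition, coldata)
)
print(unpaired_ok)
```

All three checks should come back `TRUE`, which is consistent with these designs running without a rank error. A full-rank matrix only means the model can be estimated; it does not say whether the design answers the biological question.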

However, when I account for the pairing (each subject contributes both conditions), all of the following return an error saying that the model matrix is not full rank:

  1. design = ~ subject + sex + condition
  2. design = ~ subject + sex + condition + batch
  3. design = ~ subject + sex + condition + batch + genotype + genotype:condition
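
The error arises because every subject has exactly one sex (and, in the larger datasets, one batch and one genotype), so those columns are linear combinations of the subject columns. Below is a minimal base R sketch for the first dataset only; `d1`, `mm_paired` and `mm_reduced` are illustrative names, and the reduced formula is shown just to locate the redundancy, not as a recommended final design.

```r
# First dataset: 4 subjects, each measured under treatment and control.
d1 <- data.frame(
  subject   = factor(rep(1:4, each = 2)),
  sex       = factor(rep(c("M", "M", "M", "F"), each = 2)),
  condition = factor(rep(c("treatment", "control"), times = 4),
                     levels = c("control", "treatment"))
)

# Paired design with a sex main effect: more columns than rank.
mm_paired <- model.matrix(~ subject + sex + condition, data = d1)
ncol(mm_paired)      # 6
qr(mm_paired)$rank   # 5

# The redundancy: subject 4 is the only female, so
# sexM equals (Intercept) minus the subject4 indicator.
sex_is_redundant <- all(
  mm_paired[, "sexM"] ==
    mm_paired[, "(Intercept)"] - mm_paired[, "subject4"]
)

# Without the sex main effect the matrix has full column rank,
# because the subject terms already absorb all between-subject differences.
mm_reduced <- model.matrix(~ subject + condition, data = d1)
reduced_full_rank <- qr(mm_reduced)$rank == ncol(mm_reduced)
```

Dropping `sex` makes the model estimable but does not, on its own, test whether the treatment effect differs between sexes. The DESeq2 vignette discusses this situation in its section on a model matrix that is not full rank, including the case of individuals nested within groups.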

Can someone guide me on how to set up correct designs? Any suggestions or recommended statistical reading would be appreciated. Thanks!

@mikelove

For choosing a statistical design for your analysis, I would recommend collaborating with a local statistician. It’s a really critical part of the analysis of complex datasets, and you want to make sure you understand the interpretation of results.

Yes, Michael, you are right. I discussed these questions with my supervisor and we think I should simplify my statistical design.
