How to compare the design formulas of DESeq2?
analeigh.gui ▴ 20

Dear all,

I have samples from two mouse strains, with both genders for each strain.

Strain Gender
AJ M
AJ F
BL6 M
BL6 F

The design formulas I have are:

formula 1: ~ strain # test the effect of different strains 

formula 2: ~ strain + gender # test for the effect of gender controlling for the effect of different mouse strains

formula 3: ~ strain + gender + strain:gender # test for genes where the effect of gender differs between mouse strains

Here the problem is, I'm not sure if gender has an effect on the gene expression.

So I want to test whether adding gender and the interaction term to the design formula is meaningful, as the results vary a lot between the different formulas.

I'm not familiar with statistics. Could anyone give me a hint on how to test it?

Thanks a lot!

Best,

Yujuan Gui

@mikelove

You don't have enough replicates to fit an interaction term. You can, however, compare formula 2 against formula 1 using a likelihood ratio test:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#likelihood-ratio-test
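
As a rough sketch of what that comparison could look like: the counts below are simulated with `makeExampleDESeqDataSet()` and the one-sample-per-group layout is an assumption taken from the table in the question, so substitute your own count matrix and sample table. The full model is formula 2 and the reduced model is formula 1.

```r
library(DESeq2)

set.seed(1)
# Simulated counts (1000 genes x 4 samples); a stand-in for the real count matrix
dds <- makeExampleDESeqDataSet(n = 1000, m = 4)

# Assumed layout: one sample per strain x gender combination
dds$strain <- factor(c("AJ", "AJ", "BL6", "BL6"))
dds$gender <- factor(c("M", "F", "M", "F"))

# Full model is formula 2; the reduced model drops gender (formula 1)
design(dds) <- ~ strain + gender
dds <- DESeq(dds, test = "LRT", reduced = ~ strain)

# A small adjusted p-value means adding gender improves the fit for that gene
res <- results(dds)
n_gender_genes <- sum(res$padj < 0.1, na.rm = TRUE)
print(n_gender_genes)
```

The number of genes passing the threshold gives a rough idea of how many genes need the gender term. The 0.1 cutoff here is only an illustrative choice.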


Thanks a lot! I have a follow-up question regarding the number of replicates needed for the interaction.

We are also thinking about increasing the number of replicates to 12 (6 female + 6 male) for each strain, to account for differences between individuals. In that case, would the number of replicates be enough to take the interaction into consideration?

Best,


2 mice in each group is enough to allow you to estimate the interaction with some degrees of freedom left over to assess statistical significance (well, you could even get away with 2 mice in just one of the groups, but I wouldn't recommend that!). The question now becomes one of power to detect any changes, and within reason the answer is: the more replicates, the better! This matters particularly if you're looking to settle the question of whether gender has an influence on expression: a poorly powered experiment will leave you still unable to answer this question, as a null result is interpreted as there not being enough evidence of change, rather than there being evidence of no change.
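
To make the degrees-of-freedom point concrete, here is a small base R sketch. The helper `residual_df()` is hypothetical (written for this illustration, not part of DESeq2); it builds the design matrix for a balanced strain x gender layout and subtracts the number of estimable coefficients from the number of samples.

```r
# Hypothetical helper: residual degrees of freedom for a balanced
# 2 strains x 2 genders layout with n_per_group mice in each group
residual_df <- function(n_per_group, design) {
  samples <- data.frame(
    strain = rep(c("AJ", "BL6"), each = 2 * n_per_group),
    gender = rep(rep(c("F", "M"), each = n_per_group), times = 2)
  )
  X <- model.matrix(design, data = samples)
  # number of samples minus number of estimable coefficients
  nrow(X) - qr(X)$rank
}

interaction_design <- ~ strain + gender + strain:gender

df_1 <- residual_df(1, interaction_design)  # 4 samples, 4 coefficients
df_2 <- residual_df(2, interaction_design)  # 8 samples, 4 coefficients
df_6 <- residual_df(6, interaction_design)  # 24 samples, 4 coefficients
print(c(df_1, df_2, df_6))
```

With one mouse per group there is nothing left over to estimate variability, which is why the interaction cannot be tested in that layout. Two per group leaves 4 residual degrees of freedom and six per group leaves 20.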

6 in each group seems a reasonable number, though. You may want to do a PCA or clustering to gain some intuition as to how much of an effect gender is having globally. You may then want to try formula '2' to remove gender effects: looking at the size of the genelist given by results(dds, contrast=c("gender", "M", "F")) might give you enough confidence that, if it's small, you can fall back to the even simpler formula '1'. Alternatively, the clustering (or your actual hypothesis) might mean you want separate genelists for male-specific and female-specific strain-differential genes, in which case you'd want to use formula 3.
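
A minimal sketch of those two checks, assuming 6 mice per strain x gender group. The counts are simulated with `makeExampleDESeqDataSet()` purely so the example runs; with real data you would build `dds` from your own counts and sample table.

```r
library(DESeq2)

set.seed(1)
# Simulated counts (1000 genes x 24 samples); a stand-in for real data
dds <- makeExampleDESeqDataSet(n = 1000, m = 24)
dds$strain <- factor(rep(c("AJ", "BL6"), each = 12))
dds$gender <- factor(rep(rep(c("F", "M"), each = 6), times = 2))
design(dds) <- ~ strain + gender

# PCA on variance-stabilized counts; blind = TRUE ignores the design
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
pca_data <- plotPCA(vsd, intgroup = c("strain", "gender"), returnData = TRUE)
print(head(pca_data))

# Fit formula 2 and count the genes that differ between genders
dds <- DESeq(dds)
res_gender <- results(dds, contrast = c("gender", "M", "F"))
n_gender_genes <- sum(res_gender$padj < 0.1, na.rm = TRUE)
print(n_gender_genes)
```

Calling `plotPCA()` without `returnData = TRUE` draws the plot directly. If the samples separate by gender along one of the first components, that is a hint that gender should stay in the design.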

Assuming you want to find strain-dependent genes, I'd recommend formula 2, as you're only losing a bit of power compared to '1', but with the added benefit of finding genes that have a consistent strain effect (but a different baseline) in both genders. It's still worth checking formula 3, which can give you gender-specific strain-dependent genes. It depends on what your hypothesis is.
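
For formula 3, the gender-specific strain effects could be extracted roughly as follows. This is a sketch on simulated counts, assuming AJ and F are the reference levels (set explicitly below); the coefficient names depend on those levels, so check `resultsNames(dds)` on your own object before copying them.

```r
library(DESeq2)

set.seed(1)
# Simulated counts (1000 genes x 24 samples); a stand-in for real data
dds <- makeExampleDESeqDataSet(n = 1000, m = 24)
dds$strain <- factor(rep(c("AJ", "BL6"), each = 12), levels = c("AJ", "BL6"))
dds$gender <- factor(rep(rep(c("F", "M"), each = 6), times = 2),
                     levels = c("F", "M"))
design(dds) <- ~ strain + gender + strain:gender
dds <- DESeq(dds)

# Coefficient names depend on the reference levels chosen above
print(resultsNames(dds))

# Strain effect in females (the reference gender)
res_female <- results(dds, name = "strain_BL6_vs_AJ")

# Strain effect in males: main effect plus the interaction term
res_male <- results(dds,
                    contrast = list(c("strain_BL6_vs_AJ", "strainBL6.genderM")))

# Genes whose strain effect differs between the genders
res_interaction <- results(dds, name = "strainBL6.genderM")
```

The interaction result answers a different question from the two gender-specific lists: it flags genes where the strain effect in males differs from the strain effect in females.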
