Multi factor design and removing the effect of sex on main factor
1
0
Entering edit mode
Mahendra • 0
@2e799a6e
Last seen 14 months ago
United States

Hi Michael Love,

I am pretty new in this. I have "wt" and "ko" RNA-seq samples, each three replicates for differential gene expression analysis. sample description as follows:

wt group = 2 males and 1 female ko group = 3 female

I noticed that my samples are grouping based on "sex" more than "genotype". I do not have more mice to repeat this experiment but I want to make some sense out of it.

Could you please help me with how I can correct effect of sex in my analysis or how I can remove sex-based influence using DEseq analysis?

Thank you, Mahendra

DESeq2 • 808 views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 2 days ago
United States

The short answer is that you can't. You have almost completely aliased sex and KO status, and there is no method that will fix that situation. You have unfortunately learned a lesson the hard way, but hopefully in the future you will keep this in mind when doing a new experiment.

ADD COMMENT
0
Entering edit mode

Hi James,

Thanks for your reply. Does ANOVA solve this problem since it provides p-values for both comparisons- genotype and sex? Then I can remove genes those have significantly differentially expressed in sex comparison to just interpret differentially expressed genes in genotype comparison.

Thanks, Mahendra

ADD REPLY
0
Entering edit mode

No, ANOVA doesn't solve the problem. Your WT samples are all male, and all but one female is KO. If you fit a model like ~ genotype + sex, then you will be apportioning the variability into two buckets that are almost the same and are likely to lose any significant results. Here's an example

> set.seed(12345)
> d.f <- data.frame(sex = rep(c("male","female"), c(2,4)), gt = rep(c("wt","ko"), each = 3), vals = rnorm(6) + c(2,2,2,0,0,0))
> d.f
     sex gt       vals
1   male wt  2.5855288
2   male wt  2.7094660
3 female wt  1.8906967
4 female ko -0.4534972
5 female ko  0.6058875
6 female ko -1.8179560
> summary(lm(vals~gt + sex, d.f))

Call:
lm(formula = vals ~ gt + sex, data = d.f)

Residuals:
         1          2          3          4          5          6 
-6.197e-02  6.197e-02  1.388e-17  1.017e-01  1.161e+00 -1.263e+00 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.5552     0.5736  -0.968    0.404
gtwt          2.4459     1.1471   2.132    0.123
sexmale       0.7568     1.2167   0.622    0.578

Residual standard error: 0.9934 on 3 degrees of freedom
Multiple R-squared:  0.8195,    Adjusted R-squared:  0.6991 
F-statistic: 6.809 on 2 and 3 DF,  p-value: 0.07671

> summary(lm(vals~gt, d.f))

Call:
lm(formula = vals ~ gt, data = d.f)

Residuals:
      1       2       3       4       5       6 
 0.1903  0.3142 -0.5045  0.1017  1.1611 -1.2628 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  -0.5552     0.5278  -1.052   0.3522  
gtwt          2.9504     0.7464   3.953   0.0168 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9141 on 4 degrees of freedom
Multiple R-squared:  0.7962,    Adjusted R-squared:  0.7452 
F-statistic: 15.63 on 1 and 4 DF,  p-value: 0.01678

> summary(lm(vals~sex, d.f))

Call:
lm(formula = vals ~ sex, data = d.f)

Residuals:
       1        2        3        4        5        6 
-0.06197  0.06197  1.83441 -0.50978  0.54960 -1.87424 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  0.05628    0.68225   0.082   0.9382  
sexmale      2.59121    1.18169   2.193   0.0934 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.365 on 4 degrees of freedom
Multiple R-squared:  0.5459,    Adjusted R-squared:  0.4324 
F-statistic: 4.808 on 1 and 4 DF,  p-value: 0.0934

Now let's make the females all really similar.

> d.f$vals[3] <- d.f$vals[3] - 1
> d.f
     sex gt       vals
1   male wt  2.5855288
2   male wt  2.7094660
3 female wt  0.8906967
4 female ko -0.4534972
5 female ko  0.6058875
6 female ko -1.8179560
> summary(lm(vals~gt + sex, d.f))

Call:
lm(formula = vals ~ gt + sex, data = d.f)

Residuals:
         1          2          3          4          5          6 
-6.197e-02  6.197e-02  1.735e-17  1.017e-01  1.161e+00 -1.263e+00 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.5552     0.5736  -0.968    0.404
gtwt          1.4459     1.1471   1.260    0.297
sexmale       1.7568     1.2167   1.444    0.244

Residual standard error: 0.9934 on 3 degrees of freedom
Multiple R-squared:  0.8064,    Adjusted R-squared:  0.6773 
F-statistic: 6.247 on 2 and 3 DF,  p-value: 0.08519

> summary(lm(vals~gt, d.f))

Call:
lm(formula = vals ~ gt, data = d.f)

Residuals:
      1       2       3       4       5       6 
 0.5236  0.6476 -1.1712  0.1017  1.1611 -1.2628 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  -0.5552     0.6467  -0.859   0.4390  
gtwt          2.6171     0.9145   2.862   0.0459 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.12 on 4 degrees of freedom
Multiple R-squared:  0.6718,    Adjusted R-squared:  0.5898 
F-statistic: 8.189 on 1 and 4 DF,  p-value: 0.04585

> summary(lm(vals~sex, d.f))

Call:
lm(formula = vals ~ sex, data = d.f)

Residuals:
       1        2        3        4        5        6 
-0.06197  0.06197  1.08441 -0.25978  0.79960 -1.62424 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  -0.1937     0.5320  -0.364   0.7342  
sexmale       2.8412     0.9215   3.083   0.0368 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.064 on 4 degrees of freedom
Multiple R-squared:  0.7039,    Adjusted R-squared:  0.6298 
F-statistic: 9.507 on 1 and 4 DF,  p-value: 0.03681

Because sex and genotype are almost completely aliased, each one separately looks significant, but together they are not. And it's not possible to say if the difference is due to the genotype or to sex differences, or some combination of the two.

ADD REPLY
0
Entering edit mode

That makes sense. Thank you so much for your detailed explanation.

ADD REPLY

Login before adding your answer.

Traffic: 302 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6