How to block for a subject in a 2 factor design formula
Onyi Ukay • 0
@onyi-ukay-17718
Last seen 4.9 years ago

My actual dataset is larger but here's a snapshot of what it looks like: 

Subject | Cell_type | Condition
---------|------------|------------
1           | A              | normal
1           | B              | normal
2           | A              | diseased
2           | B              | diseased
3           | A              | normal
3           | B              | normal

I would like to find differentially expressed genes in the following comparisons:
   i. A.normal vs A.diseased
   ii. B.normal vs B.diseased
   iii. normal.A vs normal.B
   iv. diseased.A vs diseased.B

How can I include an interaction between Cell_type and Condition in the design formula while blocking on Subject?

My idea was to create another column in the data.frame called group:

Subject | Cell_type | Condition | Group
---------|------------|------------|----------
1           | A              | normal    | A.normal
1           | B              | normal    | B.normal
2           | A              | diseased | A.diseased
2           | B              | diseased | B.diseased
3           | A              | normal    | A.normal
3           | B              | normal    | B.normal

and then use the formula (~ Subject + Group), but that doesn't work either. Is there a workaround?

deseq2 differential gene expression rnaseq • 2.9k views
@mikelove
Last seen 1 day ago
United States

Subject is nested within condition. You can fit condition-specific cell-type differences while controlling for subject, and you can also contrast those cell-type differences across conditions, using this approach:

http://master.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#group-specific-condition-effects-individuals-nested-within-groups

Note that you can't control for subject using fixed effects and contrast directly across condition, because subject and condition are confounded. The above approach works because you make within-subject comparisons (cell-type B vs A), which can be assessed for each condition group or compared across condition group.
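A minimal base-R sketch of both points, using only the six-row snapshot from the question. The `ind.n` column (subject index within condition) follows the linked vignette section; the names `coldata`, `naive`, and `nested` are illustrative, not part of any package.

```r
# Sample table from the question (snapshot only; the real dataset is larger).
coldata <- data.frame(
  Subject   = factor(c(1, 1, 2, 2, 3, 3)),
  Cell_type = factor(c("A", "B", "A", "B", "A", "B")),
  Condition = factor(c("normal", "normal", "diseased", "diseased",
                       "normal", "normal"),
                     levels = c("normal", "diseased"))
)

# Naive design: the Conditiondiseased column is identical to the Subject2
# column, so the matrix has 4 columns but is not full rank.
naive <- model.matrix(~ Subject + Condition, coldata)
naive_rank <- qr(naive)$rank

# Nested design: re-index subjects within each condition
# (normal: subjects 1 and 3 become 1 and 2; diseased: subject 2 becomes 1).
coldata$ind.n <- factor(c(1, 1, 1, 1, 2, 2))
nested <- model.matrix(~ Condition + Condition:ind.n + Condition:Cell_type,
                       coldata)

# Unbalanced groups leave all-zero columns (there is no second diseased
# subject in this snapshot); drop them before fitting.
nested <- nested[, colSums(nested != 0) > 0]
nested_rank <- qr(nested)$rank

print(colnames(nested))
```

The two `Condition...:Cell_typeB` columns are the within-subject B vs A effects, one per condition, which is why this design can be fit while `~ Subject + Condition` cannot.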


Michael, thank you for your insightful response. I have some additional questions.

Suppose my references are Subject1, Cell_typeA, and Conditionnormal.

I tried the method and extracted the term for one of four comparisons of interest: contrast=list("Conditionnormal.Cell_typeB","Conditiondiseased.Cell_typeB")

As explained in ?results, I understand that this code tests whether the difference between diseased and normal is attributable to Cell_type. However, how do I get the contrast Conditionnormal.Cell_typeB_vs_Conditiondiseased.Cell_typeB? I see there's a term "Condition_diseased_vs_normal", but that's for the reference Subject1, isn't it?

Also how do I extract the other three contrasts: 

   i. A.normal vs A.diseased, i.e., "Conditionnormal.Cell_typeA" - "Conditiondiseased.Cell_typeA"
   ii. normal.A vs normal.B, i.e., "Conditionnormal.Cell_typeA" - "Conditionnormal.Cell_typeB"
   iii. diseased.A vs diseased.B, i.e., "Conditiondiseased.Cell_typeA" - "Conditiondiseased.Cell_typeB"

To put it more precisely, how do I extract the within-subject comparisons (cell-type B vs A)? When I get the term "Conditionnormal.Subject3", is that equivalent to the comparison Cell_type_B_vs_A within Conditionnormal.Subject3, or is it a term to be added to the reference contrast to account for the Conditionnormal.Subject3 effect?


As I said above, you cannot compare normal vs diseased within a cell type and control for subject using a fixed effects model, because those are confounded. The model cannot be fit meaningfully.

The only approach I know of, if you need to compare directly across condition, and you want to account for subject in the model, is to account for subject correlations using duplicateCorrelation() in the limma-voom framework.
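A rough sketch of that limma-voom route, under stated assumptions: the counts are simulated placeholders, the `Group` column is the one proposed in the question, and the object names (`design`, `corfit`, `cm`) are illustrative. The limma user's guide describes a fuller workflow that runs `voom()` a second time with the estimated correlation; this sketch uses a single pass for brevity.

```r
set.seed(1)

# Sample table from the question, with the combined Group factor.
coldata <- data.frame(
  Subject = factor(c(1, 1, 2, 2, 3, 3)),
  Group   = factor(c("A.normal", "B.normal", "A.diseased", "B.diseased",
                     "A.normal", "B.normal"))
)

# One column per group, no intercept, so contrasts are simple differences.
design <- model.matrix(~ 0 + Group, coldata)
colnames(design) <- levels(coldata$Group)

if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("edgeR", quietly = TRUE)) {
  # Placeholder counts with a per-subject shift (assumption, not real data).
  gene_mean <- 2^runif(500, 4, 10)
  subj_eff  <- exp(rnorm(3, sd = 0.3))[as.integer(coldata$Subject)]
  counts <- matrix(rnbinom(500 * 6, mu = outer(gene_mean, subj_eff), size = 5),
                   ncol = 6)

  dge <- edgeR::calcNormFactors(edgeR::DGEList(counts))
  v   <- limma::voom(dge, design)

  # Estimate the within-subject correlation and treat Subject as a block.
  corfit <- limma::duplicateCorrelation(v, design, block = coldata$Subject)
  fit <- limma::lmFit(v, design, block = coldata$Subject,
                      correlation = corfit$consensus.correlation)

  # Across-condition comparisons within each cell type.
  cm <- limma::makeContrasts(
    A_diseased_vs_normal = A.diseased - A.normal,
    B_diseased_vs_normal = B.diseased - B.normal,
    levels = design
  )
  fit2  <- limma::eBayes(limma::contrasts.fit(fit, cm))
  top_A <- limma::topTable(fit2, coef = "A_diseased_vs_normal")
}
```

Here subject enters as a correlation between repeated measures rather than as fixed-effect columns, so the design matrix stays full rank even though subject and condition are confounded.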


I appreciate the help!


For the part of my analysis that does not directly handle confounded Subjects and Condition, what if I took this approach:

Take two subsets, one for Conditionnormal and one for Conditiondiseased, then run DESeq with design = ~ Subject + Cell_type on each subset.

Would you recommend an approach like this, with two separate analyses?


You can compare cell types using the entire dataset; you don't need to split it into two subsets. My first post said "You can fit condition-specific cell-type differences while controlling for subject, and you can also contrast those cell-type differences across conditions", and then I linked to the section of the documentation where we show how to do this.
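For illustration, a hedged sketch of what that whole-dataset fit might look like. The counts are simulated placeholders, `ind.n` is the within-condition subject index from the linked vignette section, and the coefficient names are the ones quoted earlier in this thread; check `resultsNames(dds)` on your own object before relying on them.

```r
set.seed(1)

coldata <- data.frame(
  Cell_type = factor(c("A", "B", "A", "B", "A", "B")),
  Condition = factor(c("normal", "normal", "diseased", "diseased",
                       "normal", "normal"),
                     levels = c("normal", "diseased")),
  ind.n     = factor(c(1, 1, 1, 1, 2, 2)),
  row.names = paste0("sample", 1:6)
)

# Custom model matrix; drop all-zero columns caused by unbalanced groups.
mm <- model.matrix(~ Condition + Condition:ind.n + Condition:Cell_type, coldata)
mm <- mm[, colSums(mm != 0) > 0]

if (requireNamespace("DESeq2", quietly = TRUE)) {
  # Placeholder counts standing in for the real count matrix (assumption).
  gene_mean <- 2^runif(500, 4, 10)
  counts <- matrix(rnbinom(500 * 6, mu = rep(gene_mean, 6), size = 5),
                   ncol = 6,
                   dimnames = list(paste0("gene", 1:500), rownames(coldata)))

  dds <- DESeq2::DESeqDataSetFromMatrix(counts, coldata, design = ~ 1)
  dds <- DESeq2::DESeq(dds, full = mm)  # supply the custom matrix via `full`

  # Cell type B vs A within each condition (within-subject comparisons).
  res_normal   <- DESeq2::results(dds, name = "Conditionnormal.Cell_typeB")
  res_diseased <- DESeq2::results(dds, name = "Conditiondiseased.Cell_typeB")

  # Is the B vs A effect different in diseased compared with normal?
  res_diff <- DESeq2::results(
    dds,
    contrast = list("Conditiondiseased.Cell_typeB", "Conditionnormal.Cell_typeB")
  )
}
```

This covers the two within-condition cell-type comparisons and their difference in a single fit; the direct normal vs diseased comparisons within a cell type are not available from this model.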
