DESeq2 accounting for two confounders with interaction in the design matrix
@mohamedrefaat11992-22243

Hi all!

I am analyzing RNA-seq data with DESeq2, and I have two confounding variables. One is the batch number and the other is whether cells were washed or not.

The data consist of samples from two cell lines that are un/treated with dox and un/modified with the Luc/pax5/pax5-ita gene. The samples come from two batches: the first contains only one cell line (NALM6) and the second contains two (NALM6 and MHHCALL2). This is an abridged version of the metadata table:

cohort cell_line mod treatment
2 MHHCALL2 Luc ctr
2 MHHCALL2 Luc dox
2 MHHCALL2 P5 ctr
2 MHHCALL2 P5 dox
2 MHHCALL2 P5X ctr
2 MHHCALL2 P5X dox
2 NALM6 Luc ctr
2 NALM6 Luc dox
2 NALM6 P5 ctr
2 NALM6 P5 dox
2 NALM6 P5X ctr
2 NALM6 P5X dox
1 NALM6 Luc ctr
1 NALM6 P5 ctr
1 NALM6 P5X ctr
1 NALM6 Luc dox
1 NALM6 P5 dox
1 NALM6 P5X dox

To study the effect of different combinations of modifications and treatments in the cell lines, I modified the table as follows:

cohort cell_line cell_line_cohort mod treatment sample_group_simple
2 MHHCALL2 MHHCALL2_2 Luc ctr Luc_ctr
2 MHHCALL2 MHHCALL2_2 Luc dox Luc_dox
2 MHHCALL2 MHHCALL2_2 P5 ctr P5_ctr
2 MHHCALL2 MHHCALL2_2 P5 dox P5_dox
2 MHHCALL2 MHHCALL2_2 P5X ctr P5X_ctr
2 MHHCALL2 MHHCALL2_2 P5X dox P5X_dox
2 NALM6 NALM6_2 Luc ctr Luc_ctr
2 NALM6 NALM6_2 Luc dox Luc_dox
2 NALM6 NALM6_2 P5 ctr P5_ctr
2 NALM6 NALM6_2 P5 dox P5_dox
2 NALM6 NALM6_2 P5X ctr P5X_ctr
2 NALM6 NALM6_2 P5X dox P5X_dox
1 NALM6 NALM6_1 Luc ctr Luc_ctr
1 NALM6 NALM6_1 P5 ctr P5_ctr
1 NALM6 NALM6_1 P5X ctr P5X_ctr
1 NALM6 NALM6_1 Luc dox Luc_dox
1 NALM6 NALM6_1 P5 dox P5_dox
1 NALM6 NALM6_1 P5X dox P5X_dox

As you can see, I combined the mod and treatment columns into one column called sample_group_simple, and the cohort and cell_line columns into a cell_line_cohort column. I then used the following design for the analysis:

~ cell_line_cohort + cell_line_cohort:sample_group_simple

Unfortunately, this already complex situation became more complicated when we found out that only a subset of the samples had been washed with PBS. This variable is confounded with the batch variable, since all samples of the first batch were washed, whereas in the second batch only the MHHCALL2 samples were. As a result, any attempt to account for it in the design leads to a model matrix that is not full rank. The final table looks like this:

PBS cohort cell_line cell_line_cohort mod treatment sample_group_simple
wash 2 MHHCALL2 MHHCALL2_2 Luc ctr Luc_ctr
wash 2 MHHCALL2 MHHCALL2_2 Luc dox Luc_dox
wash 2 MHHCALL2 MHHCALL2_2 P5 ctr P5_ctr
wash 2 MHHCALL2 MHHCALL2_2 P5 dox P5_dox
wash 2 MHHCALL2 MHHCALL2_2 P5X ctr P5X_ctr
wash 2 MHHCALL2 MHHCALL2_2 P5X dox P5X_dox
no_wash 2 NALM6 NALM6_2 Luc ctr Luc_ctr
no_wash 2 NALM6 NALM6_2 Luc dox Luc_dox
no_wash 2 NALM6 NALM6_2 P5 ctr P5_ctr
no_wash 2 NALM6 NALM6_2 P5 dox P5_dox
no_wash 2 NALM6 NALM6_2 P5X ctr P5X_ctr
no_wash 2 NALM6 NALM6_2 P5X dox P5X_dox
wash 1 NALM6 NALM6_1 Luc ctr Luc_ctr
wash 1 NALM6 NALM6_1 P5 ctr P5_ctr
wash 1 NALM6 NALM6_1 P5X ctr P5X_ctr
wash 1 NALM6 NALM6_1 Luc dox Luc_dox
wash 1 NALM6 NALM6_1 P5 dox P5_dox
wash 1 NALM6 NALM6_1 P5X dox P5X_dox
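
The rank problem described above can be reproduced with base R alone. The following is a minimal sketch, assuming the abridged table is the full sample sheet; the object names (meta, mm_pbs, rank_pbs) are illustrative, and the interaction term is left out so that only the two nuisance variables are compared.

```r
# Rebuild the abridged metadata: 3 cell_line_cohort groups x 3 mods x 2 treatments.
meta <- expand.grid(
  treatment = c("ctr", "dox"),
  mod = c("Luc", "P5", "P5X"),
  cell_line_cohort = c("MHHCALL2_2", "NALM6_2", "NALM6_1"),
  stringsAsFactors = FALSE
)

# Per the table, only the NALM6 samples of cohort 2 were not washed.
meta$PBS <- factor(ifelse(meta$cell_line_cohort == "NALM6_2", "no_wash", "wash"))
meta$cell_line_cohort <- factor(meta$cell_line_cohort)

# Model matrix containing both nuisance variables as main effects.
mm_pbs <- model.matrix(~ PBS + cell_line_cohort, data = meta)

# Compare the number of columns with the numerical rank.
n_cols <- ncol(mm_pbs)
rank_pbs <- qr(mm_pbs)$rank
cat("columns:", n_cols, "rank:", rank_pbs, "\n")
```

The matrix has one more column than its rank because the PBS column can be written as the intercept minus the NALM6_2 indicator, so PBS carries no information beyond cell_line_cohort in this layout.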

My question is: how can I test for the effect of the different combinations of treatment and modification in the different cell lines while accounting for the two confounders, namely PBS and cell_line_cohort?

Thanks in advance, Mohamed

deseq2 design matrix multi-confounders linear models rna-seq
@mikelove

If a nuisance variable is confounded with another nuisance variable, you can just combine the two:

nuisance <- factor(paste(nuisance1, nuisance2))

And then just use this one variable.
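
Applied to the variables in this thread, the suggestion could look like the sketch below. It assumes cell_line_cohort and PBS play the roles of nuisance1 and nuisance2, with six samples per group as in the abridged table; the underscore separator is an illustrative choice, not part of the suggestion above.

```r
# Per-sample nuisance variables, six samples per group as in the final table.
cell_line_cohort <- rep(c("MHHCALL2_2", "NALM6_2", "NALM6_1"), each = 6)
PBS <- rep(c("wash", "no_wash", "wash"), each = 6)

# Combine the two nuisance variables into a single factor.
nuisance <- factor(paste(cell_line_cohort, PBS, sep = "_"))

# Inspect the resulting levels and how many samples fall into each.
print(levels(nuisance))
print(table(nuisance))
```

With this layout the combined factor has the same number of levels as cell_line_cohort on its own, which is another way of seeing that PBS adds no separate grouping here.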


Thanks for the prompt reply, Michael!

This means that I should do the following:

1) Combine cell_line, cohort, and PBS into one variable:

cell_line_cohort_PBS <- factor(paste(cell_line, cohort, PBS, sep = "_"))

2) Use the new variable in place of cell_line_cohort, leaving the initial design formula otherwise unchanged:

~ cell_line_cohort_PBS + cell_line_cohort_PBS:sample_group_simple

Am I right?


Sorry, I missed the fact that the confounding is with a condition of interest, not another nuisance variable. In that case you can't really control for the nuisance variables.
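
One way to see this kind of confounding is to cross-tabulate wash status against cell line within the second cohort. This is a sketch under the assumption that cell line is the condition of interest being referred to and that the abridged table reflects the full layout; the object names are illustrative.

```r
# Sample sheet reduced to the three relevant columns, six samples per group.
meta <- data.frame(
  cohort    = rep(c(2, 2, 1), each = 6),
  cell_line = rep(c("MHHCALL2", "NALM6", "NALM6"), each = 6),
  PBS       = rep(c("wash", "no_wash", "wash"), each = 6)
)

# Within cohort 2, count samples per cell line and wash status.
cohort2 <- subset(meta, cohort == 2)
tab <- table(cell_line = cohort2$cell_line, PBS = cohort2$PBS)
print(tab)

# Number of cell lines in cohort 2 observed under both wash states.
n_both <- sum(rowSums(tab > 0) == 2)
cat("cell lines with both wash states:", n_both, "\n")
```

No cell line in cohort 2 is observed under both wash states, so within that cohort a cell-line difference cannot be separated from a wash difference.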
