DEseq2 design matrix for paired samples with batch effects
1
0
Entering edit mode
cylsae • 0
@cylsae-22516
Last seen 5.9 years ago

I have paired subjects from two different time points week1 and week2. In week1, half of the subjects are treated with condition 1 and half with condition 2. In week2, it is the other way around depending on the subjects' week1 treatment. A batch effect can be seen from the data between week1 and week2. I would like to use DESeq2 to perform DE analysis accounting for both batch and subject effect. Will a design = ~ batch + subject + condition makes sense ? If not, what will be the recommendation ? Thanks in advance.

deseq2 • 1.1k views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 5 days ago
United States

So you have two samples per patient? Condition 1 treated and condition 2 treated but not control?

And then there is the week variable splitting the patients in two.

ADD COMMENT
0
Entering edit mode

It is a clinical trial with crossover design (https://www.eupati.eu/clinical-development-and-trials/clinical-trial-designs/)

The two conditions are placebo treated (control) and drug treated.

Half of the subjects (randomly selected) were treated with placebo pills in visit 1 and treated with drug pills in visit 2. The other half of the subjects were treated with drug pills in visit 1 and treated with placebo in visit2.

I would like to perform paired-subject DE analysis to check drug effect. I also see batch effect between the two visits so the design shall take batch into consideration.

ADD REPLY
1
Entering edit mode

Ok I see. This could be fit with a model such as:

Y = intercept + period + treatment + subject + error

And typically the subject effect is modeled with a random effect, leading to a mixed effects model. One option would be to use DESeq2 to scale and variance stabilize the counts with vst(dds) and then to analyze each gene with a mixed effects model in R, and perform multiple test correction on the per-gene p-values.

Another option would be to use limma and try to encode the within-subject correlations with duplicateCorrelation, but you may want to post again to get recommendations for how to encode the within-subject correlations in limma.

ADD REPLY
0
Entering edit mode

Thank you very much, Michael. As I understand from the vst function manual, vst(gene counts) produces a matrix one can directly compare across libraries, right? so no further library size normalisation needed ? The reason why I am asking is that, I tried to reverse the logarithm in the vst output by calculating 2^(vst output matrix), I see library (column) sums are quite different --- maybe I am doing the wrong thing but just want to be clear. Thanks.

ADD REPLY
0
Entering edit mode

Yes the VST data is library size corrected. The column sum is a notoriously bad estimator for whether there are global shifts. Use a boxplot on log scale data (VST output).

ADD REPLY
0
Entering edit mode

Perfect. Thank you very much.

ADD REPLY

Login before adding your answer.

Traffic: 726 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6