Model and Contrasts with 4 variables
1
0
Entering edit mode
Cristina • 0
@bcb87936
Last seen 17 months ago
United Kingdom

I have a question regarding a RNA-seq data set I have.

This data consists of 2 disease groups (Disease 1, Disease 2), 2 treatments (Treatment 1, Treatment 2), and 2 times points (Timepoint 1, Timepoint 2). And in total 30 Patients.

I want to make the following comparisons:

  • Timepoint 2 vs Timepoint 1 for Patients on Treatment 1
  • Timepoint 2 vs Timepoint 1 for Patients on Treatment 2
  • Treatment 1 vs Treatment 2 irrespective of Disease at Timepoint 2
  • Treatment 1 vs Treatment 2 for Patients with Disease 1 at Timepoint 2
  • Treatment 1 vs Treatment 2 for Patients with Disease 2 at Timepoint 2

What would be the best way to design the model and do the comparisons?

contrast edgeR DESeq2 • 1.1k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 2 days ago
United States

Have you read the respective vignettes for edgeR and DESeq2? Both provide extensive examples. This support site is meant to provide technical advice for very specific questions about how to use Bioconductor software rather than provide basic statistics advice. Using statistical analysis tools presupposes that you already understand the statistics part, and if you don't then it's up to you to either learn or find someone local who does have the required knowledge.

ADD COMMENT
0
Entering edit mode

Thank you - yes I have read both manuals which are very helpful. However, I couldn't find an example that did exactly what I wanted. I have done the following:

`design <- model.matrix(~ Donor)

Treatment1.Timepoint2 <- Treatment== "Treatment1" & Timepoint=="Timepoint2"
Treatment2.Timepoint2 <- Treatment== "Treatment2" & Timepoint=="Timepoint2"
design <- cbind(design,  Treatment1.Timepoint2,  Treatment2.Timepoint2)

dge <- estimateDisp(dge, design)

fit <- glmQLFit(dge, design)
qlf1 <- glmQLFTest(fit, coef="Treatment1.Timepoint2")
topTags(qlf1)

qlf2 <- glmQLFTest(fit, coef="Treatment2.Timepoint2")
topTags(qlf2)

qlf3 <- glmQLFTest(fit, coef=31:32) #where 31:32 are "Treatment1.Timepoint2":"Treatment2.Timepoint2"
 topTags(qlf3)

qlf4 <- glmQLFTest(fit, contrast=c(rep(0, 30),1,-1)) #where 0:30 are the Donors(n=30)
topTags(qlf4)
ADD REPLY
0
Entering edit mode

The comparisons you describe are not 100% clear, but the way I interpret them, qlf1 and qlf2 answer your first two comparisons. The remainder are all within timepoint2 comparisons, at which point there is no longer any pairing, so you can just make the usual comparisons. Normally the best thing to do is to fit a cell means model and make the correct comparisons using contrasts. Something like

combo <- paste(Treatment, Timepoint, Disease, sep = ".")
design2 <- model.matrix(~ 0 + combo)
contrast <- makeContrasts((Treatment1.Timepoint2.Disease1 + Treatment1.Timepoint2.Disease2)/2 - (Treatment2.Timepoint2.Disease1 + Treatment2.Timepoint2.Disease2)/2,
                           Treatment2.Timepoint2.Disease1 - Treatment1.Timepoint2.Disease1,
                           Treatment2.Timepoint2.Disease2 - Treatment1.Timepoint2.Disease2,
                            levels = design2)

Your qlf3 comparison is testing for any genes that change expression between time 1 and time 2 for either treatment, and your qlf4 is doing the interaction term, where you are asking for genes that react differently between timepoints depending on the treatment.

ADD REPLY
0
Entering edit mode

Yes, you got the comparisons I am trying to make! Thanks for the explanation. I was just wondering how do you adjust for baseline differences between the patients in your model? Should you not consider Donors for the model?

ADD REPLY
0
Entering edit mode

You could. As I noted, qlf3 is already doing an F-test for any difference between time 1 and 2 for either treatment. In other words, Treatment1.Timepoint2 and Treatment1.Timepoint2 in your model are the difference between time 2 and time 1, within each of the two treatment arms. And qlf3 is doing an F-test that tests for either (or both) of those coefficients being different from zero. You could also do

glmQLFTest(fit, contrast = rep(c(0,0.5), c(30,2)))

Which would test that the average of the difference between time 1 and 2 for the two treatments is different from zero, which would adjust for the baseline donor effect. But that is not the same as what you asked for earlier.

ADD REPLY

Login before adding your answer.

Traffic: 853 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6