Hi DESeq2 Community,
I'm currently working on a comparison of drug effects on tumour and wild-type cells. I have tumour cells from 4 patients (2 males and 2 females) and normal cells from 4 normal people (2 males and 2 females). Each cell sample was treated under three different conditions: plain control, drug A and drug B. The data looks like:
Sample Phenotype Gender Treatment
Patient1 Tumour M 0
Patient1 Tumour M A
Patient1 Tumour M B
Patient2 Tumour M 0
Patient2 Tumour M A
Patient2 Tumour M B
Patient3 Tumour F 0
Patient3 Tumour F A
Patient3 Tumour F B
Patient4 Tumour F 0
Patient4 Tumour F A
Patient4 Tumour F B
Control1 Normal M 0
Control1 Normal M A
Control1 Normal M B
Control2 Normal M 0
Control2 Normal M A
Control2 Normal M B
Control3 Normal F 0
Control3 Normal F A
Control3 Normal F B
Control4 Normal F 0
Control4 Normal F A
Control4 Normal F B
In the PCA plot, the data was divided into 4 groups: male control, female control, male patient and female patient were located at top left, bottom left, top right and bottom right of the plot, respectively. Within each of these groups, individuals were separated vertically. For each individual, treatments effect were separated horizontally.
According to the PCA plot, I have the a design to include all the interactions and set Tumour, Male, Treatment0 as base level:
design(dds) <- ~Phenotype*Gender*Treatment
dds$Phenotype <- relevel(dds$Phenotype, "Tumour")
dds$Gender <- relevel(dds$Gender, "M")
dds$Treatment <- relevel(dds$Treatment, "0")
dds <- DESeq(dds)
Here is my first question: from what I understand, this design is accounting for patient difference. So I think I don't need to build a design like: ~Sample+PhenotypeGenderTreatment (actually this will lead to matrix full rank error). Am I right?
My biggest problem is I'm not sure how to get the main effect of Treatment on the cell, no matter normal or tumour and male or female (if I wanna keep the current design instead of using a simple comparison to combine every factor into one). Is it like the following?
results(dds,contrast=list(c("Treatment_A_vs_0", "PhenotypeNormal.GenderF", "PhenotypeNormal.GenderF.TreatmentA")))
And for the interaction of Phenotype and Treatment B in male:
results(dds,contrast=list(c("PhenotypeNormal.TreatmentB")))
for interaction of Phenotype and Treatment B in female:
results(dds,contrast=list(c("PhenotypeNormal.TreatmentB", "PhenotypeNormal.GenderF.TreatmentB")))
for interaction of Gender and Treatment A in tumour:
results(dds,contrast=list(c("GenderF.TreatmentA")))
for interaction of Gender and Treatment A in normal cell:
results(dds,contrast=list(c("GenderF.TreatmentA", "PhenotypeNormal.GenderF.TreatmentA")))
Am I doing the right things for the above 4 interactions?
And in this design, is that possible to estimate the interaction treatment A and B on the tumour males, all males and all samples respectively?
Thanks indeed for your help.
Thanks Mike, that works well. However it leads to more questions. As I observed from PCA plot, the combination of Phenotype and Gender clustered data together. I'm thinking maybe I can nest the data using Gender instead of creating an extra ind.n that in the vignette. I think this will NOT account for individual difference but treat samples within each Phenotype*Gender combination as a "big" individual. Am I on a right track?
So my model now becomes:
design(dds)=~Phenotype+Phenotype:Gender+Phenotype:Treatment
dds$Phenotype=relevel(dds$Phenotype,"Tumour")
dds$Gender=relevel(dds$Gender,"M")
dds$Treatment=relevel(dds$Treatment,"0")
dds <- DESeq(dds,betaPrior=FALSE)
Then I can get different effect of TreatmentA on Tumour and Normal by:
results(dds, contrast=list("PhenotypeNormal.TreatmentA","PhenotypeTumour.TreatmentA"))
And Gender effect on Normal cells under TreatmentB by:
results(dds, contrast=list("PhenotypeNormal.GenderF","PhenotypeNormal.TreatmentB"))
Am I doing the right thing here?
And I'm wondering if I keep using my original design without considering individual effect, will my codes in the original post achieve the things I suppose to do? I really appreciate your help.
Controlling for individual also controls for sex (note: it's "sex" not "gender") at the same time. It's a higher resolution of controlling for variation across individuals, and supersedes controlling for sex.
Are you actually suspecting some sex * treatment interaction? Just because the samples cluster by sex in the PCA doesn't mean there is necessarily an interesting interaction effect with respect to treatment, it is likely just the subset of chrX and chrY sex-related genes' baseline expression. Unless I had evidence to the contrary, I would stick with controlling for individual, which supersedes sex.
You don't need betaPrior=FALSE if you are using the current version of DESeq2 (v1.10). This is handled automatically. If you are using v1.8 you should upgrade (or wait a few weeks and upgrade to v1.12 with the Bioc release).
The differential treatment effect of A across tumor and normal is correct, except that one should divide tumor/normal not normal/tumor when presenting results (you have normal first in the contrast, which puts it in the numerator, see ?results). You want to present results such that positive LFC => larger fold change in tumor, negative LFC => smaller fold change in tumor.
Thanks indeed, Mike. That's really clear. You're totally right that there's no interaction of sex*treatment, even if I include them in the design. Cluster by sex in PCA is exactly because the expression of chrX and chrY as you suggested.
In case I'm doing the right thing about controlling for individual, may I ask if the following is right? I added a column "ID.n", and now the data table looks like:
Sample Phenotype Gender Treatment ID.n
Patient1 Tumour M 0 1
Patient1 Tumour M A 1
Patient1 Tumour M B 1
Patient2 Tumour M 0 2
Patient2 Tumour M A 2
Patient2 Tumour M B 2
Patient3 Tumour F 0 3
Patient3 Tumour F A 3
Patient3 Tumour F B 3
Patient4 Tumour F 0 4
Patient4 Tumour F A 4
Patient4 Tumour F B 4
Control1 Normal M 0 1
Control1 Normal M A 1
Control1 Normal M B 1
Control2 Normal M 0 2
Control2 Normal M A 2
Control2 Normal M B 2
Control3 Normal F 0 3
Control3 Normal F A 3
Control3 Normal F B 3
Control4 Normal F 0 4
Control4 Normal F A 4
Control4 Normal F B 4
And then the design would be:
dds$Treatment = relevel(dds$Treatment, "0")
design(dds) = ~Phenotype + Treatment +
Phenotype:Treatment
+Phenotype:ID.n
dds = DESeq(dds)
> resultsNames(dds)
[1] "Intercept" "Phenotype_Tumour_vs_Normal"
[3] "Treatment_A_vs_0" "Treatment_B_vs_0"
[5] "PhenotypeTumour.TreatmentA" "PhenotypeTumour.TreatmentB"
[7] "PhenotypeNormal.ID.nested" "PhenotypeTumour.ID.nested"
Then I can get the effect of Treatment A on Normal and Tumour respectively by the following two lines:
results(dds, contrast=list(c("Treatment_A_vs_0")),alpha=0.05) #Treatment A on Normal
results(dds, contrast=list(c("Treatment_A_vs_0","PhenotypeTumour.TreatmentA")),alpha=0.05) #Treatment A on Tumour
But how can I get the overall effect of Treatment A across both Phenotypes (regardless Tumour or Normal)?
Notice in the vignette example we leave out the main effect of condition (here treatment). I would recommend to follow that example fully, so ~Phenotype + Phenotype:ID.n + Phenotype:Treatment.
There is not a coefficient that represents the overall effect of treatment A on both phenotypes in an interaction model. You could average the phenotype-specific effects. You may want to consult with a local statistician about how to interpret the model.
Thanks for your patient and kind guidance, Mike. Yes, I can understand why the two designs are actually the same now. You made me a clearer mind and I'm going to read more about statistical references of interaction. Thanks for your help again.