Please help me build a design that incorporates between/within-patient variance. I see that vignette('DESeq2'), offers a solution for crossover design (the same patient gets each treatment): to repeat the patient label within each treatment. But this is a randomized placebo-controlled clinical trial without crossover.
We have 2 samples from each patient at each timepoint, 2 timepoints (before,after), 2 treatments (1 per patient). Total 4 samples per patient. I am blinded as to which treatment group is placebo vs active.
> samples treatment time pairs patient.Repeat patient.n grA.1002.lesion1.before A 1Before a 1 1 grA.1002.lesion2.before A 1Before b 1 1 grA.1002.lesion1.after A 2After a 1 1 grA.1002.lesion2.after A 2After b 1 1 grB.1016.lesion1.before B 1Before a 2 2 grB.1016.lesion2.before B 1Before b 2 2 grB.1016.lesion1.after B 2After a 2 2 grB.1016.lesion2.after B 2After b 2 2 grA.2002.lesion1.before A 1Before a 3 3 grA.2002.lesion2.before A 1Before b 3 3 grA.2002.lesion1.after A 2After a 3 3 grA.2002.lesion2.after A 2After b 3 3 grB.2011.lesion1.before B 1Before a 1 4 grB.2011.lesion2.before B 1Before b 1 4 grB.2011.lesion1.after B 2After a 1 4 grB.2011.lesion2.after B 2After b 1 4 grA.3052.lesion1.before A 1Before a 2 5 grA.3052.lesion2.before A 1Before b 2 5 grA.3052.lesion1.after A 2After a 2 5 grA.3052.lesion2.after A 2After b 2 5 grB.3058.lesion1.before B 1Before a 3 6 grB.3058.lesion2.before B 1Before b 3 6 grB.3058.lesion1.after B 2After a 3 6 grB.3058.lesion2.after B 2After b 3 6
Certainly the most important comparisons in the model are: design=formula(~time*treatment)
And this works.
However, I need help including within patient variance.
~patient.n + time*treatment does not work:
Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.
Again, when making the patient.Repeat column redundant across the treatments, and inputting design=formula(~patient.Repeat + time*treatment), it gives results, but comparisons are spuriously made with patient1 as the reference, without comparing patient2 to patient3, and there is a spurious combination across the treatment groups:
~patient.Repeat + time*treatment resultsNames(dds) [1] "Intercept" "patient.Repeat_2_vs_1" "patient.Repeat_3_vs_1" [4] "time_2After_vs_1Before" "treatment_A_vs_B" "time2After.treatmentA"
Of course the software is doing what is told, I'm not blaming the software, just trying to ask how to properly represent the experimental model.
I have tried everything I can think of (input as design=formula(*) into DESeqDataSetFromMatrix, prior to running DESeq). They all either fail, or seem to misrepresent the situation.
The best I have come up with that works:
~pairs:treatment + treatment*time resultsNames(dds) [1] "Intercept" "treatment_A_vs_B" "time_2After_vs_1Before" [4] "treatmentB.pairsb" "treatmentA.pairsb" "treatmentA.time2After"
Similarly:
~pairs:treatment:time + treatment*time resultsNames(dds) [1] "Intercept" "treatment_A_vs_B" [3] "time_2After_vs_1Before" "treatmentA.time2After" [5] "treatmentB.time1Before.pairsb" "treatmentA.time1Before.pairsb" [7] "treatmentB.time2After.pairsb" "treatmentA.time2After.pairsb"
Again, this runs, and I believe makes sense. I am challenged by wrapping my head around a lack of "pairsA" being directly represented - seems okay as this would be the reference to analyze "pairsB." I was hoping to have "time1Before" be the reference in this 3-way combination, but haven't figured out how to do that.
And, of course the "treatment2Treatment.time2After" should come last, but again haven't figured out how to make it so. Yet I can just insert this into contrast:
results(dds,contrast=list(c("treatmentA.time2After"))) #or results(dds,contrast=list(c("treatmentB.time2After")))
This seems like it might be okay... what do y'all think? Do I need betaPrior=FALSE?
One final idea is to combine the treatment variables, then set the contrasts manually:
samples$Group <- factor(paste0(samples$treatment,samples$time,samples$patient.n)) dds <- DESeqDataSetFromMatrix(countData = CountTable, colData=samples, design=formula(~Group)) dds <- DESeq(dds) > resultsNames(dds) [1] "Intercept" "GroupB1Before2" "GroupB1Before4" [4] "GroupB1Before6" "GroupB2After2" "GroupB2After4" [7] "GroupB2After6" "GroupA1Before1" "GroupA1Before3" [10] "GroupA1Before5" "GroupA2After1" "GroupA2After3" [13] "GroupA2After5" results(dds,contrast=list(c("GroupA2After1","GroupA2After3","GroupA2After5"),c("GroupA1Before1","GroupA1Before3","GroupA1Before5"))) #or results(dds,contrast=list(c("GroupB2After2","GroupB2After4","GroupB2After6"),c("GroupB1Before2","GroupB1Before4","GroupB1Before6")))
Help!
Further details:
- this happens to be taxonomic calls (~microbiome data).
> sessionInfo() R version 3.3.1 (2016-06-21) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.11.6 (El Capitan) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets [8] methods base other attached packages: [1] DESeq2_1.14.1 SummarizedExperiment_1.4.0 [3] Biobase_2.34.0 GenomicRanges_1.26.2 [5] GenomeInfoDb_1.10.2 IRanges_2.8.1 [7] S4Vectors_0.12.1 BiocGenerics_0.20.0
(updated to fix some variable names)
Thank you for your time Prof. Love. Here are the resulting comparisons:
Terms 1,2,7,8 make perfect sense to me. For the remaining, I wonder if it is possible to make terms txB.pt1, txB.pt2, txB.pt3 instead of what is given? Wouldn't that be a more representative comparison?
Also, if you could, please respond to the option above shown under "One final idea is to combine the treatment variables, then set the contrasts manually"
I'd suggest you meet with a local statistician who can help to explain what the terms in the design mean. The design as you have it now will allow you to compare the After vs Before effect for treatment A or B separately, and to contrast them, and it controls for the patient effects (separately for the patients in each treatment).
Thanks very much. Will do. Case closed.