Hi,
I'm having trouble setting up an edgeR design matrix for an RNA-seq dataset I've been working with. We have a number of subjects, each with 1 to 7 samples. Subjects can be further stratified into 'Responder' and 'Nonresponder'. Here's a simplified example:
Response Subject Sample
0 1 Sample_1_A
0 1 Sample_1_B
0 1 Sample_1_C
1 2 Sample_2_A
1 2 Sample_2_B
0 3 Sample_3_A
0 3 Sample_3_B
0 3 Sample_3_C
0 3 Sample_3_D
1 4 Sample_4_A
We're interested in determining genes that are DE in responders relative to nonresponders, leveraging the fact that many of the samples come from the same subject.
We've tried the following design matrix, but it fails because it is not of full rank (Response is a linear combination of the Subject columns):
design <- model.matrix(~0 + Subject + Response)
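To make the rank problem concrete, here is a minimal base-R sketch using the toy table above. The `meta` data frame is just the example data retyped (an assumption about how the real metadata is stored), and no edgeR is needed to see the issue:

```r
# Toy metadata copied from the simplified example table.
meta <- data.frame(
  Response = factor(c(0, 0, 0, 1, 1, 0, 0, 0, 0, 1)),
  Subject  = factor(c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4))
)

design <- model.matrix(~0 + Subject + Response, data = meta)

ncol(design)      # 5 columns: Subject1..Subject4 plus Response1
qr(design)$rank   # 4, so one column is redundant

# Response1 is exactly the sum of the responder subjects' columns,
# which is why the matrix cannot be of full rank.
all(design[, "Response1"] == design[, "Subject2"] + design[, "Subject4"])
```

Because every subject belongs to exactly one response group, this will happen for any number of subjects, not just in the toy data.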
What is the correct design matrix to find genes DE in responders vs. nonresponders? Additionally, while the samples from individual subjects are similar, they are not true biological replicates (they were sampled from different regions). Is there a design matrix setup that can explicitly model differences between samples within a subject, and then allow us to compare responders vs. nonresponders while accounting for subject differences? This will presumably kill a lot of our power to compare responders vs. nonresponders, but I'm still curious.
I've read the user guide for edgeR (which is very helpful), but I still couldn't determine if any of those examples matched our question.
Thanks for any help!
After combing through edgeR's user manual again: is the correct approach to just use
design <- model.matrix(~0 + Subject)
and then compare responders with nonresponders via a contrast? So if there are n responders and m nonresponders, I might use
qlf <- glmQLFTest(fit, contrast = c(-1/m, -1/m, 1/n, 1/n, ...))
This gives me results that seem believable (~1k genes up/down, 18k not significant). If this is the correct approach, can anyone shed some light on what exactly that contrast is doing? What does it mean to compare the average of responders vs. nonresponders?
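For reference, here is a minimal base-R sketch of how I understand the arithmetic of that contrast, again on the toy table. The `meta` data frame and the `beta` values are made up for illustration; in a real analysis the coefficients would come from the fitted model. Note that the weights have to follow the column order of the design matrix, so for the toy data they alternate rather than being grouped as in the call above.

```r
# Toy metadata copied from the simplified example table.
meta <- data.frame(
  Response = factor(c(0, 0, 0, 1, 1, 0, 0, 0, 0, 1)),
  Subject  = factor(c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4))
)

design <- model.matrix(~0 + Subject, data = meta)

# One response value per subject, in the same order as the design columns.
subj_resp <- meta$Response[match(levels(meta$Subject), meta$Subject)]
is_resp   <- subj_resp == "1"

n <- sum(is_resp)    # number of responder subjects
m <- sum(!is_resp)   # number of nonresponder subjects

# +1/n for each responder column, -1/m for each nonresponder column.
contrast <- ifelse(is_resp, 1 / n, -1 / m)
names(contrast) <- colnames(design)
contrast
# Subject1 Subject2 Subject3 Subject4
#     -0.5      0.5     -0.5      0.5

# Hypothetical per-subject coefficients for one gene (made-up numbers).
beta <- c(Subject1 = 2.0, Subject2 = 3.5, Subject3 = 2.4, Subject4 = 3.1)

# Applying the contrast is the same as "mean of responder coefficients
# minus mean of nonresponder coefficients".
lhs <- sum(contrast * beta)
rhs <- mean(beta[is_resp]) - mean(beta[!is_resp])
all.equal(lhs, rhs)   # TRUE
```

So, as far as I can tell, the contrast weights every subject equally within its group regardless of how many samples that subject contributed. This only shows the arithmetic; it does not by itself tell me whether the resulting test is statistically appropriate.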