Hi all,
I have 4 dose groups, technical duplicates, and pre- and post-treatment data. I'm interested in detecting a linear trend across dose groups, adjusting for pre-treatment variability. I'm not certain how to set this up. Data is like this:
patients <- data.frame(Patient = rep(c(1,2,3,4,5,6,7,8), each=4), Treatment = factor(rep(c("Placebo", "Dose1", "Dose2", "Dose3"), each=4), levels = c("Placebo", "Dose1", "Dose2", "Dose3"), ordered=TRUE), Visit = factor(rep(c("Baseline","Baseline","Visit1","Visit1"), 8)))
Is this sufficient?
design <- model.matrix(~ Visit + Treatment, data=eSet)
corfit <- duplicateCorrelation(eSet,design,block=patients$patient)
fit <- lmFit(eSet, design,block=patients$patient,correlation=corfit$consensus,
robust=TRUE)
fit2 <- eBayes(fit)
topTable(fit = fit2, coef = "Treatment.L" )
Or does there need to be a Treatment * Visit interaction, then look at the Treatment.L * Visit1 effect?
Hi Aaron,
Thanks for your time and your help. My apologies for not explaining the experiment well, I've tried to clarify below. I have a few points/questions, if you have the time to help me.
1. Treatment is an ordered factor, so limma reports linear, quadratic and cubic effects (same as lm() does, e.g. setting up contr.poly()). I don't want pairwise comparisons, I'm specifically looking for a linear (dose-response) trend across groups. This is important to the experiment, and was specified by the team as the result of interest.
2. I do have technical replicates (same visit, patient), and 2 different visits for each patient. I wasn't sure how to specifiy this in the model - I'll use avearrays as you suggest, thanks.
3. All patients are untreated at baseline, correct. If I understand you correctly, then I recode the treatment group variable to 'untreated' for all of the baseline samples. I could then specify the comparisons of interest (e.g. Placebo vs dose groups) from the treatment variable, and I would not need the 'visit' variable at all. I can see this, but then I lose the trend test across the dose groups. Can I set it up as follows, using orthogonal contrasts?
contrast.matrix <- makeContrasts(-3*Treatment.Placebo, -1*Treatment.Dose1, 1 * Treatment.Dose2, 3* Treatment.Dose3, levels=design)
Best wishes,
Jen
1. Ah, missed that in your
patients
. Yes, okay. I don't deal with ordered factors much, but it seems that some care may be required in interpretation when the quadratic and cubic terms are significant. For example:3. I would be inclined to formulate the problem in terms of the within-patient difference from the baseline. That is, for each patient, compute the log-fold difference of their visit 1 sample from their baseline. This represents the effect of the dosage, which I presume is the feature that's really of interest. I'll assume you did step 2 above:
With some coaxing, you can formulate the following design matrix:
... where the last three coefficients represent the linear, quadratic, cubic terms on the within-patient differences. You can alternatively put
Patient
induplicateCorrelation
instead of in the design matrix.Thanks Aaron, I think I follow. A trend across within-patient changes sounds like what I want. I've done it this way and the results make sense. I've calculated percent change from baseline and plotted the hits, and they look good (e.g. effect sizes look right).
Thank you for your help!
Hi Aaron,
Thanks again for your help. I'm struggling now with interpreting and explaining the FCs, with this design. I would like to present a table of results with p-values and FCs, then give a clear explanation of the analysis, and explain the results for one example (say, the top hit). I'd also like to present plots for the top hits - the preferred plots are spaghetti plots and/or boxplots of the percent-change-from-baseline. Relating all of these things to each other, correctly presented and explained, is tricky.
Our data is on a log2 scale. We are then looking at the contrasts specified above, which is, as you say, testing for a linear effect on the within-patient differences. So for each patient we have differences between 2 log-transformed data points, then we average by group and test for trend across 3 groups. Except I don't think the model is doing *exactly* that, is it? I'm struggling to understand and explain that part. So what is the logFC (and the FC) that limma produces, in this scenario? It's not really the log of the ratio of the averages of the change-from-baseline, is it? Some of these changes are negative, so how does that come into a FC?
I want to explain it something like this, but I don't think its quite right:
"This analysis is based on within-patient change-from-baseline values. The means of these changes are compared for linear increase or decrease across dose groups. The FC value for each gene is the average of the ratios of the changes-from-baseline, between successive groups."
Any help with understanding and explaining the FCs would be very much appreciated. Thank you!
Don't think in terms of fold changes. Think in terms of log-fold changes. This allows you to consider positive and negative values without any problems.
Now, from the model that I described earlier, you're getting one value per patient - the log-fold change of visit 1 over baseline. You are then fitting the
~Treatment
model to these log-fold changes. So a high-level summary would be that you're looking for changes in the treatment effect with increasing dosage. (Note, these are changes in the effect, not the effects themselves; the former represents a "difference of differences".)The real problem is with the interpretation of the
logFC
reported bytopTable
. This nominally represents the "gradient" of the linear component of the trend with respect to increasing dosage "levels". However, I'm not sure that this corresponds directly to any value of scientific interest. Interpretation of any gradient requires you to know the x-axis units - for example, increase in expression per day, or per mg of drug - but this is not given in the model. Significant quadratic or cubic components can also interfere with interpretation of the linear component, as I mentioned earlier.If you're having trouble with interpreting these components, perhaps you should just use an ordinary factor with levels in the right order, as suggested here.
There's a fair bit to consider here, I'll think it over. My concern is that I want to give some sort of measure of effect size, with a coherent description. Not so easy in this case, I think! I'll consider the ordinary factor approach too. I could compare each dose group to control, then filter out any hits where the effect sizes don't increase across doses. Thanks very much, I really appreciate your time.