Question: How do I set up the design when I have multiple regions and duplicate timepoints?
14 months ago by
tjest0
tjest0 wrote:

I have a data set for which I'm having trouble setting up the design.

At time 0, each subject underwent a procedure that generated three regions of samples (groups?): experimental tissue 1, experimental tissue 2, and a control tissue.  We want to see what happens to the genes at four time points after that.  We are mostly interested in tissue 1 vs control, but tissue 1 vs tissue 2 and tissue 2 vs control would also be useful.

One issue is that there are multiple samples for each time point, and they're all from different subjects.  There are two to five subjects at each time point. There's also baseline data from time point 0, before the procedure, which I suppose could serve as the baseline for all three experimental groups.

Here is the code I have so far, based on section 9.6 of the limma User's Guide:

library(limma)
library(splines)  # provides ns()

eset <- affy::rma(data)

X <- ns(targets$TimeType, df=2)      # spline basis for the number of days since day 0
Group <- factor(targets$RegionType)  # group number: 1, 2, or 3
design <- model.matrix(~0+Group*X)
fit <- lmFit(eset, design)
fit <- eBayes(fit)

Even when I just try region 1 vs control (omitting region 2 from the file), every single gene comes out as significant, wildly so, with p-values from about 10^-57 to 10^-11.

The other weird thing is that I suspected the problem was that I wasn't setting coef correctly, but I cannot set coef to anything.  Whatever value I choose, I get this error message:

"Error in topTableF(fit, coef = 2, number = 29000, adjust = "BH") :
unused argument (coef = 2)"

traceback() is no help.

Any help you can give would be greatly appreciated.  Thank you.

14 months ago by
Aaron Lun
Cambridge, United Kingdom
Aaron Lun wrote:

A cursory scan of the documentation would reveal that topTableF has no coef argument. Moreover, without an intercept, it doesn't make sense to use topTableF. Your null hypothesis would be that all groups and timepoints have an average-log expression of zero, which is a silly thing to test.
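To see what an F-test over all coefficients would be testing, it can help to list the columns of the design matrix with and without the intercept. The following is a minimal sketch in base R; the sample sheet (tissue labels and time points) is made up for illustration and is not taken from the question.

```r
library(splines)

# Hypothetical sample sheet: 3 tissues x 5 time points (days), for illustration only
targets <- expand.grid(RegionType = c("Control", "Tissue1", "Tissue2"),
                       TimeType = c(0, 1, 3, 7, 14))
Group <- factor(targets$RegionType)
X <- ns(targets$TimeType, df = 2)

# The design from the question (no intercept) and the same design with an intercept
no_intercept <- model.matrix(~0 + Group * X)
with_intercept <- model.matrix(~Group * X)

print(colnames(no_intercept))
print(colnames(with_intercept))
```

Without the intercept, the three Group columns represent each group's log-expression level where the spline terms are zero. Testing all columns at once therefore includes the hypothesis that those levels are zero, which would be rejected for essentially any expressed gene.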

The more interesting tests would be whether the spline coefficients for each tissue are non-zero, to determine if there is a time effect for each tissue; or whether the spline coefficients for one tissue are different to those of the other tissues, to determine if there is a tissue-specific time effect. The first case would require use of topTable with coef set appropriately, while the second case would require contrasts.fit.
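Both kinds of test can be sketched as follows. This is only an illustration under stated assumptions: the expression matrix is simulated noise, the tissue labels and time points are hypothetical, and the design is re-parametrized as ~0 + Group + Group:X so that each tissue gets its own pair of spline coefficients (one possible choice, not necessarily the one the original poster should use).

```r
library(limma)
library(splines)

set.seed(1)
# Hypothetical layout and simulated expression values, for illustration only
targets <- expand.grid(RegionType = c("Control", "Tissue1", "Tissue2"),
                       TimeType = c(0, 1, 3, 7, 14),
                       Rep = 1:2)
Group <- factor(targets$RegionType)
X <- ns(targets$TimeType, df = 2)
y <- matrix(rnorm(100 * nrow(targets), mean = 8), nrow = 100,
            dimnames = list(paste0("gene", 1:100), NULL))

# One baseline level and one set of spline coefficients per tissue
design <- model.matrix(~0 + Group + Group:X)
colnames(design) <- make.names(colnames(design))  # e.g. "GroupTissue1.X1"

fit <- lmFit(y, design)
efit <- eBayes(fit)

# Case 1: is there a time effect in Tissue1? (both of its spline coefficients are zero)
tissue1_cols <- grep("^GroupTissue1\\.X", colnames(design))
time_tissue1 <- topTable(efit, coef = tissue1_cols, number = Inf)

# Case 2: does the time trend in Tissue1 differ from that in Control?
con <- makeContrasts(GroupTissue1.X1 - GroupControl.X1,
                     GroupTissue1.X2 - GroupControl.X2,
                     levels = design)
fit2 <- eBayes(contrasts.fit(fit, con))
tissue1_vs_control <- topTable(fit2, number = Inf)
```

In both cases topTable receives more than one coefficient, so it reports an F-statistic for the joint hypothesis rather than a single t-statistic.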

In any case, you should also be accounting for the fact that each triplet of samples comes from the same patient. You would probably need to use duplicateCorrelation for this; blocking on patient in the design matrix would not allow you to compare across timepoints within a single tissue.
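A rough sketch of the duplicateCorrelation approach is given below. It assumes a hypothetical Subject column identifying the patient for each sample, and it uses simulated data with an artificial per-subject effect; the subject IDs, time points, and gene count are all invented for illustration. Note that duplicateCorrelation relies on the statmod package.

```r
library(limma)
library(splines)

set.seed(2)
# Hypothetical layout: 10 subjects, each sampled at one time point in all three tissues
subject_ids <- paste0("S", 1:10)
subjects <- data.frame(Subject = subject_ids,
                       TimeType = rep(c(0, 1, 3, 7, 14), each = 2),
                       stringsAsFactors = FALSE)
tissues <- data.frame(RegionType = c("Control", "Tissue1", "Tissue2"),
                      stringsAsFactors = FALSE)
targets <- merge(subjects, tissues)  # no shared columns, so all subject x tissue pairs

# Simulated log-expression: a shared per-subject effect plus independent noise
n_genes <- 200
subject_effect <- matrix(rnorm(n_genes * 10), n_genes, 10,
                         dimnames = list(NULL, subject_ids))
y <- 8 + subject_effect[, targets$Subject] +
  matrix(rnorm(n_genes * nrow(targets)), n_genes)
colnames(y) <- paste0("sample", seq_len(ncol(y)))

Group <- factor(targets$RegionType)
X <- ns(targets$TimeType, df = 2)
design <- model.matrix(~0 + Group + Group:X)
colnames(design) <- make.names(colnames(design))

# Estimate the consensus within-subject correlation, then pass it on to lmFit
corfit <- duplicateCorrelation(y, design, block = targets$Subject)
fit <- lmFit(y, design, block = targets$Subject,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
```

Because the subject enters as a random effect rather than as columns of the design matrix, the time and tissue coefficients all remain estimable, and the topTable and contrasts.fit calls can then be applied to this fit as usual.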


Thank you.

What happened is that I was experimenting with several different sections of the limma user's guide to try to find the most appropriate setup and I was commenting out or modifying lines from one section and inserting from another.  I never even noticed that topTableF was a thing, much less that it was used in section 9.5 but not in 9.6.  Had the error message said "topTableF does not support coef", I probably would have realized I was using the wrong function, but when it said it wasn't using it, I was trying to search for what I must have been doing with the other arguments that was superseding it.

A similar thing happened with the intercept: I didn't realize it was (~0+Group) in one place but (~Group*X) in that particular example. I got so caught up in thinking my p-values were wrong and that I needed the right coefficient that I was blind to any other issues. Turns out I just needed another set of eyes, so thank you again.

I'll look into duplicateCorrelation as well. Thank you, I probably would not have realized that would be necessary.