**20** wrote:

I have been reviewing the R/limma manual (esp. sections 9.6.1 and 9.6.2), and need conceptual help in applying this.

In brief, I have a time series with 12 time points (unevenly spaced). For my Treatment factor I have 6 replicate Control individuals plus 8 replicate Affected individuals (14 people total, with repeated measurements over time). Call these variables of interest Time and Treatment.

I also have a covariate of interest recorded for each person and timepoint, measuring a phenotype as a continuous variable. Call the variable P.

I have been considering using a regression spline, as in the example in section 9.6.2 (many time points), thinking this is more appropriate for my design.

My questions:

(A) Is it correct to include a covariate with the regression spline? E.g., if I follow the example where X=ns(Time, df=number); design=model.matrix(~Treatment*X*P).

I am not sure whether this sort of ANCOVA-type regression is valid, or how to modify it if it is not.

My primary contrasts of interest would be the Treatment:Time interaction (impact of treatment during a series of 4 baseline, 4 experimental, plus 4 recovery times), and the relation of covariate P to gene expression.

It might be nice to explore other interactions, but I am a bit concerned about over-parameterizing the model, about being able to interpret it correctly, and about testing too many contrasts. Perhaps I should have the 3-way interaction in the design matrix and only select a few contrasts (Time:Treatment, P) to analyze.

I did see there was a 'global' correction for multiple contrasts.
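To make question (A) concrete, here is a minimal sketch of one possible design: treatment-specific spline curves plus an additive effect of P, rather than the full three-way interaction. Everything in it is an assumption for illustration: the time values, the simulated P, the choice df = 3, and the `targets` data frame are all made up, and only base R and the bundled splines package are used.

```r
library(splines)

# Hypothetical layout matching the post: 6 Control + 8 Affected subjects,
# 12 unevenly spaced time points each. All values are simulated.
set.seed(1)
times   <- c(0, 1, 2, 4, 6, 7, 8, 10, 14, 16, 20, 24)   # assumed spacing
n_subj  <- 14
targets <- data.frame(
  Subject   = factor(rep(seq_len(n_subj), each = length(times))),
  Treatment = factor(rep(c("Control", "Affected"),
                         times = c(6, 8) * length(times)),
                     levels = c("Control", "Affected")),
  Time      = rep(times, times = n_subj)
)
targets$P <- rnorm(nrow(targets))   # placeholder continuous phenotype

# Natural spline basis for time; df = 3 is only an illustrative choice.
X <- ns(targets$Time, df = 3)

# Treatment-specific time curves plus an additive effect of P.
design <- model.matrix(~ Treatment * X + P, data = targets)

# Columns that together describe the Treatment:Time interaction ...
interaction_cols <- grep("^TreatmentAffected:X", colnames(design))
# ... and the single column for the covariate.
p_col <- which(colnames(design) == "P")

print(colnames(design))
```

In a limma fit on such a design, the interaction columns would be tested jointly (the manual's spline example passes several columns to topTable through coef), while P would be tested as a single coefficient. Whether P should also interact with Treatment or Time is a modelling decision this sketch does not settle.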

(B) How do I select the df? The limma example suggests 3-5, but it has 16 Control + 16 Treatment individuals with no replication.

I could select 12 knots for my 12 timepoints (df=14, or 1+1+12 knots), and if the times were evenly spaced that would yield 14 datapoints (6 Control + 8 Affected subjects) in each knot-bounded interval, if my understanding is right.

Yet that would lead to a model with a large number of parameters (for a design=model.matrix(~Treatment*X*P) there are 52 columns in the matrix).

I'm not sure if that is valid for a dataset with only 14 subjects?
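The column count of the three-way design can be checked directly before fitting anything. The sketch below counts columns at a few spline df values; the time values, the simulated P, and the helper `count_columns` are hypothetical. With a two-level Treatment and a continuous P, the full model has 4 * (1 + df) columns.

```r
library(splines)

# Simulated stand-ins for the variables in the post (values are made up).
set.seed(1)
times     <- c(0, 1, 2, 4, 6, 7, 8, 10, 14, 16, 20, 24)  # assumed spacing
Time      <- rep(times, times = 14)
Treatment <- factor(rep(c("Control", "Affected"), times = c(6, 8) * 12),
                    levels = c("Control", "Affected"))
P         <- rnorm(length(Time))

# Number of design columns for the full three-way model at a given spline df.
count_columns <- function(df) {
  X <- ns(Time, df = df)
  ncol(model.matrix(~ Treatment * X * P))
}

dfs    <- c(3, 5, 8)
n_cols <- sapply(dfs, count_columns)
print(data.frame(df = dfs, columns = n_cols, expected = 4 * (1 + dfs)))
```

Because Time takes only 12 distinct values, any function of Time spans at most 12 dimensions including the constant, so no more than 11 spline columns can be linearly independent of the intercept, whatever df is requested.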

I could do a heuristic trial and error with different values for df, but I'm not sure how best to then evaluate and compare results of different models, since there is no AIC/BIC type output.
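One way to compare candidate df values is an information criterion computed gene by gene. limma has a selectModel function intended for this kind of comparison across a list of design matrices, though its exact arguments should be checked in its help page. The sketch below uses only base R on a single simulated gene to show the idea; the data, the candidate range 2:6, and the omission of P are all assumptions, and it ignores the within-subject correlation.

```r
library(splines)

set.seed(2)
times     <- c(0, 1, 2, 4, 6, 7, 8, 10, 14, 16, 20, 24)  # assumed spacing
Time      <- rep(times, times = 14)
Treatment <- factor(rep(c("Control", "Affected"), times = c(6, 8) * 12),
                    levels = c("Control", "Affected"))

# One simulated "gene": smooth time trend, treatment shift, noise.
y <- sin(Time / 4) + 0.5 * (Treatment == "Affected") +
  rnorm(length(Time), sd = 0.3)

# Fit the same model at several candidate df values and record the AIC.
candidate_df <- 2:6
aic <- sapply(candidate_df, function(k) {
  X <- ns(Time, df = k)
  AIC(lm(y ~ Treatment * X))
})

print(data.frame(df = candidate_df, AIC = round(aic, 1)))
best_df <- candidate_df[which.min(aic)]
```

For a real dataset the same comparison would be repeated over all genes, and the df chosen would be the one preferred by most of them rather than by any single gene.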

I assume I ultimately will need to add a Subjects factor to the design matrix and/or use the duplicateCorrelation command to account for repeated measurements (the 12 timepoints) on each subject.
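The two options mentioned here behave differently in this layout. Each subject belongs to exactly one treatment group, so a fixed Subject factor absorbs the Treatment main effect, whereas blocking on Subject through duplicateCorrelation leaves the design unchanged. The sketch below checks the first point with base R and, only if limma is installed, runs the second on simulated data. The time values, df = 3, and the expression matrix are all made up.

```r
library(splines)

set.seed(3)
times     <- c(0, 1, 2, 4, 6, 7, 8, 10, 14, 16, 20, 24)  # assumed spacing
Subject   <- factor(rep(1:14, each = 12))
Treatment <- factor(rep(c("Control", "Affected"), times = c(6, 8) * 12),
                    levels = c("Control", "Affected"))
Time      <- rep(times, times = 14)
X         <- ns(Time, df = 3)

# Option 1: Subject as a fixed factor. Each subject belongs to one treatment
# group, so the Treatment main effect is aliased with the Subject columns.
design_fixed <- model.matrix(~ Subject + Treatment * X)
rank_fixed   <- qr(design_fixed)$rank
cat("columns:", ncol(design_fixed), " rank:", rank_fixed, "\n")

# Option 2: keep Subject out of the design and treat it as a blocking factor.
design <- model.matrix(~ Treatment * X)

if (requireNamespace("limma", quietly = TRUE)) {
  # Simulated expression matrix: 100 genes x 168 samples, with a subject effect.
  subj_eff <- matrix(rnorm(100 * 14), 100, 14)[, as.integer(Subject)]
  expr     <- subj_eff + matrix(rnorm(100 * 168), 100, 168)

  corfit <- limma::duplicateCorrelation(expr, design, block = Subject)
  fit    <- limma::lmFit(expr, design, block = Subject,
                         correlation = corfit$consensus.correlation)
  fit    <- limma::eBayes(fit)
  # Joint test of the three Treatment:X interaction columns.
  print(limma::topTable(fit, coef = grep(":X", colnames(design)), number = 5))
}
```

The fixed-factor design comes out one rank short of its column count, which is the aliasing described above. The Treatment:X interaction columns remain estimable under either option.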

Thank you in advance for your help in setting up the model.matrix and spline functions.

Hilary