I have a diet study experiment, where I would like to test whether a continuous variable,
cont.var, is related with gene expression data.
1) I would like to find the effect of
cont.var within each category (GroupA/GroupB), and within each time-point (T0/T1). The gene expression data was collected at time T0 and then at time T1. I would like to test the effect of the continuous variable in each of the 4 groups (GroupA_T0, GroupA_T1, GroupB_T0, GroupB_T1).
2) Additionally, I have uneven number of observations for each subject. Each subject is either in GroupA or GroupB. So we have a confounding group and subject effect. Multiple but not equal number of samples are collected for each subjects at two time points, the number of observations per subject at a specific time point varies between 0 and 3. There are 70 subjects and 326 samples in total.
3) The might be a batch effect. And the information about sample batch (B1-B5) membership is known.
As I am interested in the effect of
cont.var within each group and at each time point I started modeling the data as follows (rawcount contains the raw expression data):
design <- model.matrix(~ 0 + Lane + Group + Timepoint + Timepoint:Group:cont.var, metadata) v <- voom(rawcount, design, lib.size = libSize, plot = FALSE) # Account for subject (individual) effect: corfit <- duplicateCorrelation(v, design, block=metadata$SubjectID) v <- voom(rawcount, design, lib.size = libSize, block = metadata$SubjectID, correlation = corfit$consensus, plot = plot) fit <- lmFit(v, design, block = metadata$SubjectID, correlation = corfit$consensus) fit <- lmFit(v, design) fit <- contrasts.fit(fit, contrast) fit <- eBayes(fit)
My question is whether my deisgn matrix was declared appropriately, and what should be the contrast parameter for contrasts.fit()?
I am interested in both:
a) whether there is an effect of the continuous variable for a particular gene at all within each group and each timepoint.
b) whether the effect of the continuous variable is different between 2 groups at each time point or between time points for each group individually.
I would really appreciate some help!