Hi All,
My question below might have been answered in support.bioconductor, please direct me to those of most relevant answers, in case if I overlooked them, although I've been through many posts.
Here is my Target file:
Sample | Species | Condition | Time | Replicate | Cond_Cult |
1 | A | C | 1 | 1 | A.C |
2 | A | C | 1 | 2 | A.C |
3 | A | C | 2 | 1 | A.C |
. | A | C | 2 | 2 | A.C |
. | B | C | 1 | 1 | B.C |
. | B | C | 1 | 2 | B.C |
. | B | C | 2 | 1 | B.C |
. | B | C | 2 | 2 | B.C |
. | . | . | . | . | ... |
. | . | . | . | . | ... |
. | . | . | . | . | ... |
. | A | C | 7 | 1 | A.C |
. | A | C | 7 | 2 | A.C |
. | A | C | 8 | 1 | A.C |
. | A | C | 8 | 2 | A.C |
. | B | C | 7 | 1 | B.C |
. | B | C | 7 | 2 | B.C |
. | B | C | 8 | 1 | B.C |
. | B | C | 8 | 2 | B.C |
. | . | . | . | . | ... |
. | . | . | . | . | ... |
. | . | . | . | . | ... |
. | A | T | 1 | 1 | A.T |
. | A | T | 1 | 2 | A.T |
. | A | T | 1 | 3 | A.T |
. | A | T | 2 | 1 | A.T |
. | A | T | 2 | 2 | A.T |
. | A | T | 2 | 3 | A.T |
. | B | T | 1 | 1 | B.T |
. | B | T | 1 | 2 | B.T |
. | B | T | 1 | 3 | B.T |
. | B | T | 2 | 1 | B.T |
. | B | T | 2 | 2 | B.T |
. | B | T | 2 | 3 | B.T |
. | . | . | . | . | ... |
. | . | . | . | . | ... |
. | . | . | . | . | ... |
. | A | T | 7 | 1 | A.T |
. | A | T | 7 | 2 | A.T |
. | A | T | 7 | 3 | A.T |
. | A | T | 8 | 1 | A.T |
. | A | T | 8 | 2 | A.T |
. | A | T | 8 | 3 | A.T |
. | B | T | 7 | 1 | B.T |
. | B | T | 7 | 2 | B.T |
. | B | T | 7 | 3 | B.T |
. | B | T | 8 | 1 | B.T |
. | B | T | 8 | 2 | B.T |
96 | B | T | 8 | 3 | B.T |
Description:
There are 2 species (A & B), each of them have 2 levels of condition (Control & Treatment) and 2-3 levels of Replications.
There are two main questions, plus a few more:
Q1. How the two species behave over time once subjected to treatment (expression pattern).
Q2. How many expression patterns can be seen and what genes represent a cluster.
I followed part 9.6.2 of the Limma package;
f <- factor(Target$Cond_Cult, levels=unique(Target$Cond_Cult)) # levels: A.C B.C A.T B.T
Q3. Is A.C my primary level?
# Creating design matrix design1 <- model.matrix(~0+f) # Creating DGEList object (dge) dge <- DGEList(counts=myCounts, genes = myIDs) # Filtering keep <- filterByExpr(dge, design1) dge <- dge[keep, keep.lib.sizes=FALSE] # TMM Normalization dge <- calcNormFactors(dge) # the voom transformation is applied to the normalized and filtered DGEList object v <- voom(dge, design1, plot=TRUE) # Setting up a basis for a natural regression spline library(splines) X <- ns(Target$Time, df=5) Group <- factor(Target$Cond_Cult) design2 <- model.matrix(~Group*X) colnames(design2) <- gsub("Group", "", colnames(design2)) colnames(design2) # [1] "(Intercept)" "K.I" "P.C" "P.I" # [5] "X1" "X2" "X3" "X4" # [9] "X5" "K.I:X1" "P.C:X1" "P.I:X1" # [13] "K.I:X2" "P.C:X2" "P.I:X2" "K.I:X3" # [17] "P.C:X3" "P.I:X3" "K.I:X4" "P.C:X4" # [21] "P.I:X4" "K.I:X5" "P.C:X5" "P.I:X5" # Fitting Spline model fit <- lmFit(v, design2) fit <- eBayes(fit, robust = TRUE) topTable(fit, coef=10:24, number=20)
Q4. Is my design matrix (design2) appropriate for my main questions?
Q5. How to find out the number of gene clusters and extract them from the analysis (command in R).
Thank you all.
Hossein