Hello,
I have a multi-level experiment that I am feeding into limma to find differentially expressed genes (DEGs), and I have a question about the normalization strategy, given that I am making both within-subject and between-subject comparisons. The experiment is a pre-post design where each subject is given a "treatment". Some patients responded to treatment and others did not. I want to make between-subject comparisons (responder vs. non-responder) as well as model the treatment effect. In particular, I am interested in finding DEGs in responders as an effect of treatment. I am modeling patient as a random effect.

As I understand it, TMM takes the sample closest to the average as a reference and scales the other samples relative to it, which is a form of between-sample normalization. That is why I used it here, since I expect the between-sample heterogeneity to be larger. However, there is still variability within subjects, given that each was sequenced at two time points. Is this the correct strategy for normalization?
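# remove genes that are lowly expressed across samples
library(edgeR)  # also loads limma
countsPerMillion <- cpm(dgList)  # assuming this is how countsPerMillion was computed
summary(countsPerMillion)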
countCheck <- countsPerMillion > 1
head(countCheck)
keep <- which(rowSums(countCheck) >= 2)
dgList <- dgList[keep,]
summary(cpm(dgList))
# TMM normalization: compute scaling factors that adjust the library sizes
dge <- calcNormFactors(dgList, method="TMM")
# create variable for contrast
matrix$treat <- factor(paste(matrix$treatment, matrix$responder, sep="."))
# create model matrix
design <- model.matrix(~0 + treat + SV1 + SV2, data=matrix)
# estimate within-subject correlation
sup <- voom(dge, design)
corfit <- duplicateCorrelation(sup, design, block=matrix$individual)
corfit$consensus.correlation
# voom transformation accounting for correlation
counts <- voom(dge, design, block=matrix$individual, correlation=corfit$consensus.correlation)
# account for correlation in a model with patient block as random effect
fit <- lmFit(counts, design, block=matrix$individual, correlation=corfit$consensus.correlation)
fit <- eBayes(fit)
# ...and then make the contrasts
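
For the contrasts step, I have something like the following in mind. This is only a sketch: the level names (pre.responder, post.responder, and so on) are placeholders for however treatment and responder are actually coded in my data, and the toy design leaves out SV1 and SV2 so that the example runs on its own.

```r
library(limma)

# Placeholder level names: the real ones depend on how treatment and responder are coded
treat <- factor(c("pre.responder", "post.responder",
                  "pre.nonresponder", "post.nonresponder"))

# Toy design with one column per group (the real design also contains SV1 and SV2)
design <- model.matrix(~0 + treat)
colnames(design) <- levels(treat)

# Treatment effect within each group, and the difference between those effects
cont <- makeContrasts(
  RespTreatment    = post.responder - pre.responder,
  NonrespTreatment = post.nonresponder - pre.nonresponder,
  Interaction      = (post.responder - pre.responder) -
                     (post.nonresponder - pre.nonresponder),
  levels = design
)
print(cont)

# With contrasts built from the full design, the real fit would continue as:
# fit2 <- contrasts.fit(fit, cont)
# fit2 <- eBayes(fit2)
# topTable(fit2, coef = "RespTreatment")
```

The RespTreatment contrast is the one I care about most (treatment effect in responders). The Interaction contrast would test whether that effect differs between responders and non-responders.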