Hi,
I have a very complex paired design to test for differential expression with limma. I have a control group and intervention group (more than 100 RNA-seq samples). The intervention group received a treatment but not the control. So my expression data is structured like this:
Pre1: control pre-treatment
Pre2: intervention pre-treatment
Post1: control post-treatment
Post2: intervention post-treatment
I would like to compare:
Post1-Pre1
Post2-Pre2
Pre2-Pre1
Post2-Post1
But taking into account batch effects (batch1 and batch2) and sex (male/female) . But I am not sure if including batch effect because Pre and Post samples coming from the same individual fall in the same batch.
This is my limma code so far:
head(info)
|
y=DGEList(counts=counts) A<-rowSums(y$counts) isexpr<-A>500 y=y[isexpr,keep.lib.size=FALSE] dim(y) y=calcNormFactors(y) sex=as.factor(info$SEX) batch=as.factor(info$BATCH) Treat=factor(paste(info$TREATMENT, info$GROUP, sep=".")) design=model.matrix(~0+Treat+sex+batch) colnames(design) v=voom(y,design) cor=duplicateCorrelation(v,design,block=info$PATIENT) fit=lmFit(v,design,block=info$PATIENT, correlation=cor$consensus) cm=makeContrasts(CONTROLvsINTERVENTIONforPRE= TreatPre.2-TreatPre.1, CONTROLvsINTERVENTIONforPOST= TreatPost.2-TreatPost.1, PrevsPostforCONTROL= TreatPost.1-TreatPre.1, PrevsPostforINTERVENTION=TreatPost.2-TreatPre.2,levels=design) fit2=contrasts.fit(fit,cm) fit2=eBayes(fit2) summary(decideTests(fit2)) Pre1Pre2_table=topTable(fit2,coef="CONTROLvsINTERVENTIONforPRE",n=Inf, sort="p", p=0.05) Post1Post2_table=topTable(fit2,coef="CONTROLvsINTERVENTIONforPOST",n=Inf, sort="p", p=0.05) Pre1Post1_table=topTable(fit2,coef="PrevsPostforCONTROL",n=Inf, sort="p", p=0.05) Pre2Post2_table=topTable(fit2,coef="PrevsPostforINTERVENTION",n=Inf, sort="p", p=0.05)
You said that blocking directly on patient in the design matrix makes impossible to compare directly across blocking levels, but this is actually what we did in the last posts I compared Post2-Post1 and obtained 13 DE genes.
You are not comparing
Post2
toPost1
expression in those posts. Rather, you are comparing thePost2
log-FC (before/after treatment) to thePost1
log-FC. This is possible as you have computed those log-FCs within patients, such that the blocking factor for each patient cancels out. You can't compare absolute expression between patients when they're blocked in the design, because the blocking factors for different patients would not cancel out and will absorb DE.