Hi everyone,
Sorry if the following post is too long.
I have below RNA-seq samples with very severe donor batch effects, but the design is balanced. I am running limma+voom pipeline.
sample sample_details 1617 A16_1_1 IFNAR_Mcl1_DKO_MEFs_Treated_ABT 737 161Q A16_1_1 IFNAR_Mcl1_DKO_MEFs_Treated_ABT737_QVD 161U A16_1_1 IFNAR_Mcl1_DKO_MEFs_Untreated 1627 A16_2_10 IFNAR_Mcl1_DKO_MEFs_Treated_ABT 737 162Q A16_2_10 IFNAR_Mcl1_DKO_MEFs_Treated_ABT737_QVD 162U A16_2_10 IFNAR_Mcl1_DKO_MEFs_Untreated 2517 A25_1_9 IFNAR_Mcl1_DKO_MEFs_Treated_ABT 737 251Q A25_1_9 IFNAR_Mcl1_DKO_MEFs_Treated_ABT737_QVD 251U A25_1_9 IFNAR_Mcl1_DKO_MEFs_Untreated 2527 A25_2_3 IFNAR_Mcl1_DKO_MEFs_Treated_ABT 737 252Q A25_2_3 IFNAR_Mcl1_DKO_MEFs_Treated_ABT737_QVD 252U A25_2_3 IFNAR_Mcl1_DKO_MEFs_Untreated
> dmat = model.matrix(~ 0 + sample_details + sample, data=IFNAR$samples) > v <- voom(IFNAR, design=dmat, plot=T) > corr <- duplicateCorrelation(v, dmat, block=IFNAR$samples$sample) > corr$consensus [1] 0.9919988 > v <- voom(IFNAR, design=dmat, block=IFNAR$samples$sample, correlation=corr$consensus, plot=T)
I first tried to correct the batch effect with limma duplicateCorrelation
, however, it seems that from my MDS plot there is still a strong donor effect.
> plotMDS(v$E, label=IFNAR$samples$sample_details, col=as.numeric(IFNAR$samples$sample), cex=.8, main="Colored by donor (batch): After voom")
Then I tried below to explore the data, which is better but not optimal:
> IFNAR_batchcorrected <- removeBatchEffect(v$E, batch=IFNAR$samples$sample, design=model.matrix(~IFNAR$samples$sample_details))
I also tried with ComBat, which gives the best result, maybe because my sample size is small, it is better for an EB method:
> IFNAR_combat <- ComBat(v$E, batch=IFNAR$samples$sample, mod=model.matrix(~IFNAR$samples$sample_details, data=IFNAR$samples), prior.plots=T)
So I use combat adjusted data to continue my analysis.
> dmat <- model.matrix(~ 0 + sample_details, data=IFNAR$samples)
> vfit <- lmFit(IFNAR_combat, dmat) > cfit <- contrasts.fit(vfit, cmat) > efit <- eBayes(cfit) > plotSA(efit)
The problem is when checking the residual-average plot, there is a strong trend between residual and average. I guess this is because the voom precision weights were lost after I merely put combat adjusted data into lmFIt
.
So my question is whether I can simply using following code instead:
> v$E <- IFNAR_combat > fit <- lmFit(v, dat)
Or is there any better suggestion ?