Hi,
I did an analysis a few years ago using ComBat adjusted microarray values fed into limma and was wondering if it's acceptable (Maybe not ideal, but informative). We've been able to validate many of these genes since. I've since read that when using limma with a ComBat modified matrix, the batch covariate should still be used in the subsequent limma design. I did not do that here however, duplicateCorrelation was used on SampleID which is nested within Batch. Does this possibly serve a similar purpose without also putting Batch in design on top of duplicateCorrelation of SampleID? My code follows:
> SampleTable
Sample SampleID Batch Treatment
1 A_Tx1 A Batch1 Tx1
5 B_Tx1 B Batch1 Tx1
9 C_Tx1 C Batch2 Tx1
13 D_Tx1 D Batch2 Tx1
2 A_Tx2 A Batch1 Tx2
6 B_Tx2 B Batch1 Tx2
10 C_Tx2 C Batch2 Tx2
14 D_Tx2 D Batch2 Tx2
3 A_Tx3 A Batch1 Tx3
7 B_Tx3 B Batch1 Tx3
11 C_Tx3 C Batch2 Tx3
15 D_Tx3 D Batch2 Tx3
4 A_Tx4 A Batch1 Tx4
8 B_Tx4 B Batch1 Tx4
12 C_Tx4 C Batch2 Tx4
16 D_Tx4 D Batch2 Tx4
modcombat = model.matrix(~1, data=SampleTable) ### Treatment could be included here alternatively
combat_edata = ComBat(dat=edata, batch=SampleTable$Batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)
design <- model.matrix(~0+SampleTable$Treatment)
corfit <- duplicateCorrelation(combat_edata,design,block=SampleTable$SampleID)
fitV <- lmFit(combat_edata, design, block=SampleTable$SampleID, cor=corfit$consensus.correlation)
fitV <- eBayes(fitV)
fitV2 <- eBayes(contrasts.fit(fitV, contrasts = c(1,0,-1,0)))
sig <- topTable(fitV2, coef = 1 , number = 10000000, p.value = 0.05)
Running it with Batch included in the design in addition to duplicateCorrelation on SampleID produces similar gene lists with the provided code.
Venn Diagram: https://ibb.co/fDLQ7MP
Is this a non-incorrect approach? Thanks!
Hi Dr. Smyth,
Thanks so much for the response. This is what I expected, but wanted to double check. Luckily, the genes that are sensitive to adding Batch to the design are not ones we're particularly interested in for this study. Thanks again!