I did an analysis a few years ago using ComBat adjusted microarray values fed into limma and was wondering if it's acceptable (Maybe not ideal, but informative). We've been able to validate many of these genes since. I've since read that when using limma with a ComBat modified matrix, the batch covariate should still be used in the subsequent limma design. I did not do that here however, duplicateCorrelation was used on SampleID which is nested within Batch. Does this possibly serve a similar purpose without also putting Batch in design on top of duplicateCorrelation of SampleID? My code follows:
> SampleTable Sample SampleID Batch Treatment 1 A_Tx1 A Batch1 Tx1 5 B_Tx1 B Batch1 Tx1 9 C_Tx1 C Batch2 Tx1 13 D_Tx1 D Batch2 Tx1 2 A_Tx2 A Batch1 Tx2 6 B_Tx2 B Batch1 Tx2 10 C_Tx2 C Batch2 Tx2 14 D_Tx2 D Batch2 Tx2 3 A_Tx3 A Batch1 Tx3 7 B_Tx3 B Batch1 Tx3 11 C_Tx3 C Batch2 Tx3 15 D_Tx3 D Batch2 Tx3 4 A_Tx4 A Batch1 Tx4 8 B_Tx4 B Batch1 Tx4 12 C_Tx4 C Batch2 Tx4 16 D_Tx4 D Batch2 Tx4 modcombat = model.matrix(~1, data=SampleTable) ### Treatment could be included here alternatively combat_edata = ComBat(dat=edata, batch=SampleTable$Batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE) design <- model.matrix(~0+SampleTable$Treatment) corfit <- duplicateCorrelation(combat_edata,design,block=SampleTable$SampleID) fitV <- lmFit(combat_edata, design, block=SampleTable$SampleID, cor=corfit$consensus.correlation) fitV <- eBayes(fitV) fitV2 <- eBayes(contrasts.fit(fitV, contrasts = c(1,0,-1,0))) sig <- topTable(fitV2, coef = 1 , number = 10000000, p.value = 0.05)
Running it with Batch included in the design in addition to duplicateCorrelation on SampleID produces similar gene lists with the provided code.
Venn Diagram: https://ibb.co/fDLQ7MP
Is this a non-incorrect approach? Thanks!