Unexpected clustering of samples after batch effect removal with ComBat
jaro.slamecka ▴ 140
@jaroslamecka-7419
Mitchell Cancer Institute, Mobile AL, U…

Hi,

I'm trying to find differentially expressed genes between two expression microarray datasets, both Illumina HT-12 v4. One has only 3 samples (3 different patients); the other has 5 samples in triplicate (biological replicates, clones). I loaded the datasets using lumi, then transformed (vst) and normalized (rsn) each one individually, and combined them based on common features. The mean expression is around 7.4 in the first dataset and around 8.4 in the second, so I tried to remove the batch effect using ComBat with the following:

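library(sva) ## provides ComBat
pheno = pData(EDATA) ## sample annotation, including the batch column
edata = exprs(EDATA) ## expression matrix (features x samples)
batch = pheno$batch
modcombat = model.matrix(~1, data=pheno) ## intercept-only model: no biological covariates are protected
combat_edata = ComBat(dat=edata, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)
exprs(EDATA) = combat_edata ## to plug the expression values back into the original ExpressionSet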

My problem is that, after ComBat, the samples cluster unexpectedly. The 3 samples that make up the first dataset should cluster together and separately from the rest of the samples. Instead, not even the triplicate samples cluster together anymore. Is there something wrong with this approach? Or could ComBat be destroying the differences? The RNA comes from cells of a similar type (all iPSC) but from very different culture systems. Any help would be greatly appreciated.

Jaro

 


If the data are all from the same array type, why are you combining them based on common features? On the same array, the features are by definition all common.

In addition, if you are trying to make comparisons between batches, and each biological group falls entirely within one batch (e.g., if you are comparing the 3 samples in the first set vs the 5 in the second set), then it is not possible to remove the batch effect. In this situation, the biological differences are completely confounded with batch and there is no way to unscramble that egg.
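
To make the confounding concrete, here is a minimal sketch in base R. It assumes a hypothetical sample sheet that mirrors the layout described in the question (3 samples in batch 1, 5 lines in triplicate in batch 2); the `group` column and its labels are invented for illustration and are not taken from the original data.

```r
# Hypothetical sample sheet (assumption): 3 samples in batch 1,
# 5 lines x 3 replicates = 15 samples in batch 2.
pheno <- data.frame(
  batch = factor(c(rep(1, 3), rep(2, 15))),
  group = factor(c(rep("set1", 3), rep("set2", 15)))
)

# Cross-tabulate batch against group: each group occurs in only one batch
print(table(batch = pheno$batch, group = pheno$group))

# Design matrix containing both a group term and a batch term
design <- model.matrix(~ group + batch, data = pheno)

# If the rank is lower than the number of columns, the group and batch
# coefficients cannot be estimated separately
design_rank <- qr(design)$rank
cat("design columns:", ncol(design), " rank:", design_rank, "\n")
```

When the rank is smaller than the number of columns, as it is for this layout, no model can tell the set1 vs set2 difference apart from the batch 1 vs batch 2 difference. Running the same check on the real `pData(EDATA)` would show whether the actual design has this problem.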
 
