Question

Appropriate input of merged datasets regarding batch effect correction with ComBat in R

svlachavas ▴ 840

@svlachavas-7225

Germany/Heidelberg/German Cancer Resear…

Dear Community,

I would like to ask a specific question about the least error-prone way to apply ComBat batch effect correction to microarray datasets. My goal is to test a 39-gene signature, which I obtained through a feature selection procedure in R on a training microarray dataset, in 5 independent datasets, to assess its discriminatory power for a two-class disease status. All of the testing datasets are from the same platform. My plan is to normalize each dataset separately, then merge them and perform batch effect correction before testing the classifier.

My crucial question is this: should I normalize and batch correct the datasets using all available probesets, and only then subset the merged dataset to the 39 gene symbols mentioned above for testing the classifier, so that the batch effect correction, like the normalization, benefits from the signals of all probesets? Or is that approach incorrect, and should I instead subset each dataset to these 39 genes right after normalization?
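To make the two options concrete, here is a minimal sketch on simulated data. It assumes the `sva` package is installed (the ComBat step is skipped otherwise). The probe IDs, the placeholder signature, the dataset sizes, and the additive per-dataset shift are all hypothetical stand-ins, not the real datasets.

```r
# Simulated stand-in for five separately normalized datasets from one platform.
# All names and sizes below are hypothetical.
set.seed(1)
n_probes <- 500
n_per_batch <- 12
probes <- paste0("probe_", seq_len(n_probes))
signature <- probes[1:39]  # placeholder for the 39-gene signature
batch <- factor(rep(paste0("dataset", 1:5), each = n_per_batch))

# Merged matrix: probes in rows, samples in columns.
merged <- matrix(rnorm(n_probes * length(batch)), nrow = n_probes,
                 dimnames = list(probes, paste0("s", seq_along(batch))))
# Add a simple additive shift per dataset to mimic a batch effect.
merged <- sweep(merged, 2, as.integer(batch) * 0.5, "+")

if (requireNamespace("sva", quietly = TRUE)) {
  # Option A: batch correct with all probesets, then subset to the signature.
  corrected_all <- sva::ComBat(dat = merged, batch = batch, mod = NULL)
  option_a <- corrected_all[signature, ]

  # Option B: subset to the signature first, then batch correct.
  option_b <- sva::ComBat(dat = merged[signature, ], batch = batch, mod = NULL)

  # Largest absolute difference between the two corrected matrices.
  print(max(abs(option_a - option_b)))
}
```

The two options generally give different corrected values, because ComBat's default parametric empirical Bayes step estimates its priors from whichever probesets are passed in.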

microarray batcheffectcorrection affymetrix microarrays ComBat sva • 1.7k views

updated 9.5 years ago by chris86 ▴ 420 • written 9.5 years ago by svlachavas ▴ 840

score 0 · Answer 1 · 2016-07-11

chris86 ▴ 420

@chris86-8408

UCL, United Kingdom

If your signature is reproducible, you should be able to normalise each dataset separately, without merging them or batch correcting across them, and apply your classification algorithm to the two groups within each dataset. You may have to batch correct within individual datasets, but I would not do it all together for this purpose.
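A rough sketch of what per-dataset testing could look like, using base R only and simulated data. The gene names, the `simulate_dataset` helper, and the mean z-score used as a stand-in for the real classifier are all assumptions made for illustration.

```r
set.seed(42)
all_genes <- paste0("gene_", 1:200)
signature <- all_genes[1:39]  # hypothetical placeholder for the 39-gene signature

# Simulate one normalized dataset (genes x samples) with its own overall shift.
simulate_dataset <- function(n_samples, shift) {
  labels <- factor(rep(c("control", "disease"), length.out = n_samples))
  expr <- matrix(rnorm(length(all_genes) * n_samples, mean = shift),
                 nrow = length(all_genes),
                 dimnames = list(all_genes, paste0("s", seq_len(n_samples))))
  # Signature genes are raised in disease samples.
  is_disease <- labels == "disease"
  expr[signature, is_disease] <- expr[signature, is_disease] + 1
  list(expr = expr, labels = labels)
}

datasets <- lapply(c(d1 = 0, d2 = 2, d3 = -1),
                   function(shift) simulate_dataset(20, shift))

# AUC from the rank-sum (Mann-Whitney) statistic.
auc <- function(score, labels, positive = "disease") {
  pos <- labels == positive
  r <- rank(score)
  (sum(r[pos]) - sum(pos) * (sum(pos) + 1) / 2) / (sum(pos) * sum(!pos))
}

# Score samples using only information from within the given dataset.
evaluate_within <- function(ds) {
  sig <- ds$expr[signature, , drop = FALSE]
  z <- t(scale(t(sig)))  # z-score each gene across this dataset's samples
  auc(colMeans(z), ds$labels)
}

aucs <- sapply(datasets, evaluate_within)
print(round(aucs, 2))
```

Because each gene is scaled within its own dataset, the different overall shifts between the simulated datasets never enter the comparison, so no cross-dataset merging or batch correction is needed for this kind of test.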

9.5 years ago by chris86 ▴ 420