Question

Using of limma moderated t-test with "corrected" expression matrix resulting from ComBat batch effect correction

4

Entering edit mode

svlachavas ▴ 830

@svlachavas-7225

Last seen 9 months ago

Germany/Heidelberg/German Cancer Resear…

Dear Bioconductor Community,

based on the very interesting question on a previous post (batch effect : comBat or blocking in limma ?) regarding the possible batch correction methodologies, through the answers created i desided to adress a very important issue in my opinion-as im still a newbie in these specific statistics-which i also dont have find any relevant explanation in any according paper of MOOC. In detail, Dr Johnson kindly mentioned that limma should not be used in conjuction with ComBat, when the latter applied for batch effect correction. With all respect(and of course without arguing against), i would like to ask some further explanations for this specific matter:

this could be more context and experimental design specific ?? that is, it depends on the magnitude of the batch effect, the variances between the batches, or also the experimental design used in Limma itself(i.e. paired or unpaired comparison) ? Or should someone always avoid it ? And if so, what is the main notion ? (It has to do with the "correlation" mentioned in the above post and possibly with the overestimation of DE ?)

Please excuse me for creating this post, but i believe this question is very important and i would like to get the maximum feedback, because i would like to know (as i used a combination of them in one of my recent analyses) if "still" could the combination considered "valid" under certain circumstances !!

Thank you for your consideration on this matter !!

Efstathios

limma batch effect ComBat differential gene expression • 5.0k views

ADD COMMENT • link updated 8.8 years ago by W. Evan Johnson ▴ 850 • written 8.8 years ago by svlachavas ▴ 830

score 5 · Answer 1 · 2015-10-04

5

Entering edit mode

Gordon Smyth 51k

@gordon-smyth

Last seen 20 minutes ago

WEHI, Melbourne, Australia

As a general rule, it is not correct to batch correct before a differential expression analysis. This will result is overstatement of statistical significance. Rather the batch correction should be built into the DE analysis.

This principle is doesn't depend on comBat, or limma, or on the statistical design. It's a general principle.

The only exception would be when the batch correction can be estimated from different data to that used for the DE analysis. That would be unusual however.

ADD COMMENT • link 8.8 years ago Gordon Smyth 51k

0

Entering edit mode

Dear Gordon,

thank you for your crusial explanation-just to pinpoint an example-one thing that bothers me though-, regarding some older conversation about a specific analysis that both you and Aaron kindely provided explanations(C: Correct creation of design matrix in limma regarding a multi-level experiment)- for this specific analysis, i compared the DE genes after Combat with limma(without including the batch effect variable, just the pairs factor) and one without running ComBat but including in limma the batch effect variable along with the pairs(although Aaron have indicated that this would be redudant-Merge different datasets and perform differential expression analysis in limma)---so, as only a small group of genes(40) are different when i compare the two lists(this with the batch effect correction has 37 more DE genes), This could provide a "relative" confidence about the "above general principle" of the overestimation of statistical significance you underline ?

Thank you in advance,

Efstathios

ADD REPLY • link 8.8 years ago svlachavas ▴ 830

3

Entering edit mode

One good way to see the overestimation of significance in action would be to simulate a dataset where all the null hypotheses are true, both with and without an added batch effect. Then analyze both the datasets with all the methods you are considering and compare the resulting p-values to see if any method is systematically over-conservative or over-liberal. A well-behaved method should give a completely uniform (i.e. flat) p-value distribution.

ADD REPLY • link 8.8 years ago Ryan C. Thompson ★ 7.9k

0

Entering edit mode

Dear Dr Thompson,

thank you for idea !! Firstly, regarding my above observation i mention could provide even a small "confidence" about batch effect overestimation ?

Secondly, as im using R about a year and im "relatively" new in statistics:

1. Regarding the simulation you mean, you mean something like permuting the labels? and leave the gene expression as it is, while create two cases: a) take the batch corrected expression set and permute the labels and continue b) the same without correcting for batch effect ?

Please excuse me if my question is naive or irrelevant but i havent performing so far any "simulation" you mention !!

Best,

Efstathios

ADD REPLY • link 8.8 years ago svlachavas ▴ 830

1

Entering edit mode

I would be hesitant to draw conclusions about differences in performance between methods based on data where you don't already know the answer. That's why I recommended simulating a dataset from scratch, so you can control all the variables and you will know exactly which simulated genes are DE and which are not. This allows you do draw ROC curves and such for each method to evaluate their relative performance.

If you want to pursue this, there are a number of "benchmark" papers that give examples of how to do such a simulation. I don't have one handy at the moment, but you should be able to find one on PubMed.

ADD REPLY • link 8.8 years ago Ryan C. Thompson ★ 7.9k

0

Entering edit mode

I see. Then in simple "naive" words, simulate 2 datasets, one from the batch corrected dataset and the other from the same dataset without a batch effect correction, right ? I will search extensively for papers, with a small search i have found

http://www.researchgate.net/publication/255702173_A_Flexible_Microarray_Data_Simulation_Model

Maybe i could also use the below tools from Mr Vegard if they can assist in evaluating this specific issue

ADD REPLY • link 8.8 years ago svlachavas ▴ 830

0

Entering edit mode

Yes, I'm not familiar with that particular method, but it looks like a reasonable simulation method. They also provide an R package, which certainly helps.

ADD REPLY • link 8.8 years ago Ryan C. Thompson ★ 7.9k

0

Entering edit mode

Efstathios, I'm not quite sure what is bothering you or what you mean by "relative confidence".

For your paired experiment, you don't even need a batch correction. The correct analysis is also the simplest.

Other analyses are more complicated and are either not needed or are theoretically wrong.

ADD REPLY • link 8.8 years ago Gordon Smyth 51k

0

Entering edit mode

Dear Gordon,

because i have not such great experience in these issues, i was more "affected" from literature agumenting the necessity when there are two or more different datasets, batch effect correction is essential for merging !! Thus, i was afraid that the pairs factor could not handle a batch effect resulting from both studies !! Nevertheless, as your opinion and your consideration on this matter are trustful and crusial, i will perform the same analysis without batch effect correction and only including the pairs--and inspect and possibly stick with these results.

ADD REPLY • link 8.8 years ago svlachavas ▴ 830

score 2 · Answer 2 · 2015-10-05

Dear Efstathios

I see that you have many thoughts and questions regarding the same things my colleagues and I had when working with the article mention in the above referred post:
"Methods that remove batch effects while retaining group differences may lead to exaggerated confidence in downstream analyses" http://biostatistics.oxfordjournals.org/content/early/2015/08/31/biostatistics.kxv027.full.pdf

I will try to help you out as best I can.

There are 4 different ways of handling batch effect using ComBat and/or limma (or any batch effect correction tool combined with a DE-analysis tool).
1. No batch correction + limma with batch
2. Batch correction + limma with batch
3. Batch correction + limma without batch
4. No batch correction + limma without batch
In our paper we recommend 1. Approach 2 is mentioned as a possibility both by Dr. Johnson (I if understood him correctly) in the above mentioned previous post and in the PARTEK user guide(a commercial software with a similar pair of tools). 1 and 2 should give pretty similar results for most situations. For a fairly balanced experimental design, approach 3 would also give quite similar results. Approach 4 is disregarding the batch effect altogether. 3 and 4 are the problematic ones which can lead to false positive or false negative findings, depending on the experimental design, real group effects, batch correction method used (group as covariate or not) and probably other things. If you have used 3 or 4 in an analysis, the most rigorous path is to reanalyze with approach 1 (or equally 2).

Assessing the adverse effect of approach 3 under different settings was something we spent considerable time on. And as aides in doing so, we made a couple of shiny-tools, "batch-adjustment-simulator" and "fdist-app", which are available via our additional analyses page at GitHub;
https://github.com/ous-uio-bioinfo-core/batch-adjust-warning-reports
Please feel free to try them out or downloading them, but be aware that they are sparsly tested or documented.

Regarding the correlation introduced by a batch effect correction tool, a practical way is to consider this a batch effect as well. The original batch effect is replaced by the batch effect estimation errors. So, if you start out with a batch effect, then correct for batch, you still have a batch effect in your data. Hopefully smaller, maybe not, anyways potentially troublesome. When using analysis tools were the batch effect can not be included (unlike limma), correcting for batch before applying is often the best hope for a more reliable result, but no guarantee.

Vegard.

score 2 · Answer 3 · 2015-10-06

Sorry to put my input in a little late on this one. While I agree with most of what has been said here, there are a few things I would like to point out:

I respectfully disagree with Gordon on the notion that it "is not correct to correct for batch correct before differential expression." What is true are the following:

1) Blocking with Limma is the easiest way to adjust for batch effects, especially when the batch effects are not severe.

2) It is not correct to use ComBat and then assume the data are independent in any downstream analyses. However, if you properly account for the correlation (limma with batch in the design, weighted least squares, etc), its completely fine to use a two step approach and ComBat, even in differential expression cases.

3) ComBat treats batch differences in the mean and variance as random effects. ComBat shrinks batch effect estimates across genes to get more robust estimates, especially in small batches. Blocking in limma will treat batch as a fixed effect (no shrinkage), and does not account for heterozygosity in batch variances. The latter can be addressed if you add an extra step (arrayWeights function), but again the variance differences are assumed to be fixed effects. Because I strongly believe believe that batch effects are systematic in many cases, ComBat often provides a more appropriate model for batch effects, especially when they are severe.

Thanks for this discussion. I think overall it has been helpful.

score 0 · Answer 4 · 2015-10-05

Dear Vegard,

thank your your reply and your detailed answer !! I will definatelly read the paper-Firstly, before creating this post, i asked in the above post you mention and which also discusses the above parer: C: batch effect : comBat or blocking in limma ? -then I created this post, because i had general questions about this specific topic(Mr. Smyth above kindly provided an explanation)-especially with one specific analysis i performed recently was a bit complicated. In detail, i had two separate colorectal cancer datasets[hgu133a dataset-26 paired samples-13 patients & hgu133plus2 dataset-34 samples-17 patients] which have common phenotype information. Thus, i merged the two datasets based on their common probesets(after i have normalized each separately with custom Brain array CDFs), while have some options for batch effect correction, from which i used limma. After, because the samples are paired(each patient has a cancer vs adjucent control sample), i created a paired limma analysis, but without the batch-as Mr. Aaron Lun have mentioned a very important issue: "This is because any differences between batches will be absorbed into the differences between pairs, i.e., the effect of the different chip will be included in the estimate of the patient-specific effect. Inclusion of batch would be redundant and result in a design matrix that is not of full rank, as we can't distinguish between effect of the patient and that of the chip". Thus, i didnt include the batch factor in my desing matrix. In other words, maybe this is different from the approach number 3 you mention above, as is this specific context my analysis is paired ?

Now, as i have underlined also above, i also merged the two datasets without batch effect correction regarding the different study-and then performed limma but with the batch variable also in the design matrix (approach number 1 above), and the results(with some "arbitarty" cutoffs used also in my above procedure) are quite similar only 40 genes less than my first implementation(and the majority of my genes are common). This maybe could also provide some "relative confidence" about the general issue with batch effect correction.

On the other hand, because my analysis is paired, i could also use without batch effect correction and only including the pairs in my design matrix

Finally, with your above github link, i can "inspect" somehow and check a relative "performance" of my procedures ??

Best,

Efstathios