sva::ComBat without covariate of interest?
@brent-pedersen-4815
United States

Older versions of the SVA manual suggest including the covariate of
interest in the model when running ComBat -- e.g. 'cancer' in this
manual: http://bioconductor.org/packages/2.9/bioc/vignettes/sva/inst/doc/sva.pdf

The devel and release versions:
http://bioconductor.org/packages/release/bioc/vignettes/sva/inst/doc/sva.pdf
say: "Just as with sva, we then need to create a model matrix for the
adjustment variables, but do not include the variable of interest." (emphasis mine).

Is this correct? Including the covariate of interest seems much more
sensible to me.

sva combat • 3.5k views
Jeff Leek ▴ 610
@jeff-leek-5015
United States

Hi Brent,

Good eye! Similar question here: 

A: ComBat - Including variable of interest in model matrix?

It was pointed out to us that including the covariates in the correction step can lead to some anti-conservative bias in two-step procedures (where you first clean with ComBat, then do a separate differential expression analysis). So for now we are recommending not including the variable of interest. We are working on a more complete solution that will allow the variable of interest to be used and will correct the downstream analysis. Stay tuned!

Best,

Jeff


Hi,

What is the current conclusion to this question? The manual I have (compiled May 22nd 2019) says to include the covariate of interest, but it doesn't seem to be included in the example.

Many thanks,

Lucy

@w-evan-johnson-5447
United States

All, 

Sorry for the confusion on this. We have been having discussions on how to handle covariates in two-step procedures, e.g. (step 1) batch adjustment, followed by (step 2) significance testing.

The proper way to handle a two-step batch adjustment/significance test is as follows:

    - Step 1: Adjust for batch with ComBat and include any adjustment variables, including the covariate of interest.

    - Step 2: Use a modified F- or t-test for significance. For example:

        - The modified F statistic is F = ((rss0 - rss1)/(df1 - df0)) / (rss1/(n - df1 - nbatches)), where rss0 is the residual sum of squared errors (SSE) of the reduced model, rss1 is the SSE of the full model, df0 and df1 are the numbers of parameters in the reduced and full models, n is the number of samples, and nbatches is the number of batches. This should be compared against an F distribution with df1 - df0 and n - df1 - nbatches degrees of freedom.
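
To make the arithmetic in Step 2 concrete, here is a minimal sketch in plain Python (not part of sva). The function name `modified_f_statistic` and the example numbers are hypothetical. The sketch assumes, as the formula suggests, that df0 and df1 count only the non-batch parameters, since the batches are subtracted separately.

```python
def modified_f_statistic(rss0, rss1, df0, df1, n, nbatches):
    """Return (F, numerator df, denominator df) for the modified F-test.

    rss0, rss1 : residual sums of squares of the reduced and full models,
                 both fitted to the batch-adjusted data
    df0, df1   : numbers of parameters in the reduced and full models
                 (assumed here to exclude the batch terms)
    n          : number of samples
    nbatches   : number of batches removed in the adjustment step
    """
    df_num = df1 - df0
    # The denominator loses nbatches extra degrees of freedom to account
    # for the batch effects estimated in step 1.
    df_den = n - df1 - nbatches
    if df_num <= 0:
        raise ValueError("the full model must have more parameters than the reduced model")
    if df_den <= 0:
        raise ValueError("not enough samples for this model and number of batches")
    f_stat = ((rss0 - rss1) / df_num) / (rss1 / df_den)
    return f_stat, df_num, df_den


# Hypothetical numbers: 50 samples, 3 batches, intercept-only reduced model
# (df0 = 1) against intercept plus one covariate of interest (df1 = 2).
f_stat, df_num, df_den = modified_f_statistic(
    rss0=120.0, rss1=80.0, df0=1, df1=2, n=50, nbatches=3
)
print(f_stat, df_num, df_den)
```

With these made-up inputs the denominator has 45 degrees of freedom rather than the 48 an unadjusted F-test would use, so the statistic is smaller than the unadjusted one. The p-value is the upper tail of an F distribution with `df_num` and `df_den` degrees of freedom, which any statistics library can supply.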

Publications in the literature discussing this issue are forthcoming and we will be changing the sva documentation to reflect this.

Thanks!

Evan
