Hey Magda,
The two-step method is still a reasonable approach. It has worked well
for me in multiple situations. I do have a beta version of ComBat
that will handle two batch variables at the same time. It
works well in theory--but I have yet to test it thoroughly across
multiple datasets. I'm willing to share the code if you want to test
it on your data (let me know).
ComBat in the sva package can handle numeric covariates, but it does
not deal with continuous batch variables. Adjusting the mean of a
continuous batch variable would be straightforward (assuming a linear
effect), but the variance adjustment would be very tricky.
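To make the "mean only, linear effect" idea concrete, here is a minimal sketch in plain Python. It is not ComBat and not anything in the sva package: the function names remove_linear_effect and adjust_matrix are made up for illustration, it fits an ordinary least-squares slope per probe, and it assumes there are no biological covariates that need protecting. It leaves the variance untouched, which is the hard part.

```python
from statistics import mean


def remove_linear_effect(values, position):
    """Remove a fitted linear trend in `position` from one probe's values.

    Hypothetical helper for illustration only. The slope is estimated by
    ordinary least squares and the probe's overall mean is preserved.
    """
    x_bar = mean(position)
    y_bar = mean(values)
    sxx = sum((x - x_bar) ** 2 for x in position)
    if sxx == 0:
        # Every sample sits at the same position, so there is no trend to fit.
        return list(values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(position, values))
    slope = sxy / sxx
    # Subtract only the centered trend so the probe mean does not shift.
    return [y - slope * (x - x_bar) for x, y in zip(position, values)]


def adjust_matrix(data, position):
    """Apply the adjustment probe by probe (data is probes x samples)."""
    return [remove_linear_effect(probe, position) for probe in data]


# Toy example: one probe whose signal drifts steadily with chip row.
rows = [1, 2, 3, 4, 5, 6]
probe = [0.50, 0.52, 0.54, 0.56, 0.58, 0.60]
adjusted = remove_linear_effect(probe, rows)
```

In the toy example the drift with row is removed and the probe mean is unchanged. A real analysis would also need to keep any covariates of interest in the model, which this sketch ignores.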
Ultimately, since the two-step approach seems to have worked, I think
your best option is to just move forward with those results.
Thanks!
Evan
On Feb 19, 2014, at 4:00 AM, <bioconductor-request at r-project.org> wrote:
> Message: 23
> Date: Tue, 18 Feb 2014 16:45:12 -0800
> From: Magda Price <magdaprice at gmail.com>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Subject: [BioC] ComBat: 2 adjustment variables & continuous adjustment variables
> Message-ID:
> <cadkr4v=ydd1abjxfhtd+xwq8mzmp_=urhvdtpxoturpqjzb7tg at mail.gmail.com>
> Content-Type: text/plain
>
> Hi!
>
> I'm writing with a few questions about applying ComBat (sva package) to a
> set of ~50 samples run on the Illumina Infinium HumanMethylation450
> BeadChip array (~450,000 DNA methylation data points).
>
> There is a large amount of variation in my data due to both the batch the
> samples were run in (3 different batches) and the position they were
> located on the chip - specifically the row (6 different rows), but not
> the column. The chips are set up in a 6 row * 2 column format like this:
>
>
> sample 01 sample 02
> sample 03 sample 04
> sample 05 sample 06
> sample 07 sample 08
> sample 09 sample 10
> sample 11 sample 12
>
>
> I read Dr. Evan Johnson's suggestions to someone else with this
> "2-batch-effect-variable" problem in the ComBat google group (
> https://groups.google.com/forum/#!topic/combat-user-forum/PcTxNlaUmAI). He
> had 2 good suggestions:
>
> 1. Combine the two batch variables into one, if 3-4 reps are left in
>    each batch
> 2. Use ComBat twice, adjusting for the first batch using the second
>    batch as a covariate, and then adjust for the second batch.
>
> I cannot go with the first suggestion because combining the 2 batch
> variables would create 18 batch categories (3 batches * 6 rows), and I
> would not have enough replicates per batch category.
>
> So I tried the second option - applying ComBat twice. I first corrected for
> row and then took the row-corrected data and applied ComBat again,
> correcting for batch. It seems to have worked & the correlation of my
> technical replicates improves. I am seeking advice on two points:
>
> 1. The google group post is now a few years old; is it still thought
>    that the step-wise correction is a valid approach?
> 2. Row would be better treated as a continuous adjustment variable than
>    a factor. In the version of sva that I am using (3.0.2) I believe that
>    only factor adjustment variables are supported. I have seen mention in
>    a few forums that there might be an update to ComBat to adjust for a
>    numeric batch variable; is one available?
>
> Thank you in advance for your help!
>
> Magda Price,
> University of British Columbia