Magda Price
Johnson, William Evan <wej at ...> writes:
>
> Hey Magda,
>
> The two-step method is still a reasonable approach. It has worked well for
> me in multiple situations. I do have a beta version of ComBat that will
> handle two batch variables at the same time. It works well in theory, but
> I have yet to test it thoroughly across multiple datasets. I'm willing to
> share the code if you want to test it on your data (let me know).
>
> ComBat in the sva package can handle numeric covariates, but it does not
> deal with continuous batch variables. Adjusting the mean of a continuous
> batch variable would be straightforward (assuming a linear effect), but
> the variance adjustment would be very tricky.
>
> Ultimately, since the two-step approach seems to have worked, I think your
> best option is to just move forward with those results.
>
> Thanks!
>
> Evan
>
> On Feb 19, 2014, at 4:00 AM, <bioconductor-request at ...> wrote:
>
> > Message: 23
> > Date: Tue, 18 Feb 2014 16:45:12 -0800
> > From: Magda Price <magdaprice at ...>
> > To: "bioconductor at ..." <bioconductor at ...>
> > Subject: [BioC] ComBat: 2 adjustment variables & continuous adjustment
> > variables
> > Message-ID:
> > <cadkr4v=ydd1abjxfhtd+xwq8mzmp_=urhvdtpxoturpqjzb7tg at ...>
> > Content-Type: text/plain
> >
> > Hi!
> >
> > I'm writing with a few questions about applying ComBat (sva package) to a
> > set of ~50 samples run on the Illumina Infinium HumanMethylation450
> > BeadChip array (~450,000 DNA methylation data points).
> >
> > There is a large amount of variation in my data due both to the batch the
> > samples were run in (3 different batches) and to the position they were
> > located in on the chip, specifically the row (6 different rows) but not
> > the column. The chips are set up in a 6 row * 2 column format like this:
> >
> >
> > sample 01 sample 02
> > sample 03 sample 04
> > sample 05 sample 06
> > sample 07 sample 08
> > sample 09 sample 10
> > sample 11 sample 12
> >
> >
> > I read Dr. Evan Johnson's suggestions to someone else with this
> > "2-batch-effect-variable" problem in the ComBat google group (
> > https://groups.google.com/forum/#!topic/combat-user-forum/PcTxNlaUmAI).
> > He had 2 good suggestions:
> >
> > 1. Combine the two batch variables into one, if 3-4 reps are left in
> > each batch
> > 2. Use ComBat twice, adjusting for the first batch using the second
> > batch as a covariate, and then adjust for the second batch.
> >
> > I cannot go with the first suggestion because combining the 2 batch
> > variables would create 18 batch categories (3 batches * 6 rows), and I
> > would not have enough replicates per batch category.
> >
> > So I tried the second option - applying ComBat twice. I first corrected for
> > row and then took the row-corrected data and applied ComBat again,
> > correcting for batch. It seems to have worked & the correlation of my
> > technical replicates improves. I am seeking advice on two points:
> >
> > 1. The google group post is now a few years old; is it still thought
> > that the step-wise correction is a valid approach?
> > 2. Row would be better treated as a continuous adjustment variable than
> > a factor. In the version of sva that I am using (3.0.2) I believe that
> > only factor adjustment variables are supported. I have seen mention in a
> > few forums that there might be an update to ComBat to adjust for a
> > numeric batch variable; is one available?
> >
> > Thank you in advance for your help!
> >
> > Magda Price,
> > University of British Columbia
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at ...
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
Hi Evan,
Thanks for your response, and so sorry for my delay; I wasn't notified by
e-mail that you had responded.
Since I wrote the first note, a few things have changed:
1. I have discovered a third batch variable (chip, in addition to plate & row)!
2. Based on forum feedback, I figured I was okay to stick with all factor
adjustment variables.
An additional question has come up in reference to your suggestion:
> > Use ComBat twice, adjusting for the first batch using the second
> > batch as a covariate, and then adjust for the second batch.
I can't include the second batch variable as a covariate; I get a
singularity error because the batches are confounded with each other. For
example, all samples on chip A were run in batch 1. Do you still think it a
valid approach if I can't use subsequent batch variables as covariates?
Thank you for the offer & advice!
Magda
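
The singularity described above can be reproduced on a toy example. What follows is a minimal sketch in plain Python (not R, and not sva's own code). It assumes a design laid out as one indicator column per level of the batch being adjusted, plus reference-coded columns for the covariate. The chip, batch, and row labels and the helper names `one_hot`, `matrix_rank`, and `design` are hypothetical. When every chip sits inside a single batch, the batch column is a sum of chip columns, so the design matrix loses rank. A crossed layout such as row by batch stays full rank.

```python
from fractions import Fraction


def one_hot(labels, drop_first):
    """Indicator columns for a factor; optionally drop the first level as reference."""
    levels = sorted(set(labels))
    if drop_first:
        levels = levels[1:]
    return [[1 if lab == lev else 0 for lev in levels] for lab in labels]


def matrix_rank(rows):
    """Matrix rank via exact Gauss-Jordan elimination on Fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design(adjusted, covariate):
    """Assumed layout: all levels of the adjusted factor, covariate reference-coded."""
    adj_cols = one_hot(adjusted, drop_first=False)
    cov_cols = one_hot(covariate, drop_first=True)
    return [a + c for a, c in zip(adj_cols, cov_cols)]


# Hypothetical nested layout: chips A and B ran in batch 1, chips C and D in batch 2.
chip = ["A", "A", "B", "B", "C", "C", "D", "D"]
batch = ["1", "1", "1", "1", "2", "2", "2", "2"]
nested = design(chip, batch)

# Hypothetical crossed layout: every row position appears in every batch.
row = ["r1", "r1", "r2", "r2"]
batch_crossed = ["1", "2", "1", "2"]
crossed = design(row, batch_crossed)

print("nested:  rank", matrix_rank(nested), "of", len(nested[0]), "columns")
print("crossed: rank", matrix_rank(crossed), "of", len(crossed[0]), "columns")
```

In the nested case the rank (4) is smaller than the number of columns (5), which is the kind of situation that typically triggers a singularity error when the confounded variable is supplied as a covariate. In the crossed case the rank equals the number of columns (3).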