Dear List,

I have microarray data where condition is completely confounded with a
time batch effect. When doing a PCA on the RMA-normalized data, the
first principal component clearly separates the two batches.

What options do I have if I still want to compare different conditions
across batches?
As far as I understand, I can't remove this batch effect with either
limma or, e.g., ComBat.
Shall I just go on with the limma analysis and keep in mind that some
genes might pop up just because of batch effects?

best,
tefina

Hi Tefina,

It sounds like you are in a very difficult situation (i.e. if the
design is such that batch separates condition exactly). If the
condition is nested within batch or vice versa you may be able to
minimize the effect, but otherwise it may be a wash. If you can address
an effect with limma, it will usually be possible to address it more
effectively with ComBat, but if you cannot separate batch from
condition then even something like SVA won't work.

I don't suppose you have any replicates that were run within each
batch? It's a long shot, but if you did, you could use those to
estimate the batch effects.

You might consider something like fRMA (instead of, say, RMA) to
leverage other people's non-confounded designs. That's the best I can
think of; hopefully one of the authors will happen upon your post and
comment on whether this makes sense given your situation. I would say
something about experimental design, but obviously you are well aware
of its importance given your question. Hopefully you can use fRMA or
technical replicates to salvage it.

Best of luck,

--t

On Thu, Mar 8, 2012 at 2:56 AM, tefina <tefina.paloma@gmail.com> wrote:

> Dear List,
>
> I have microarray data where condition is completely confounded with
> a time batch effect. When doing a PCA on the RMA normalized data, the
> first principal component separates clearly the two batches.
>
> What option do I have when I still want to compare different
> conditions across batches?
> As far as I understood I can't dissolve this batch effect neither
> with limma nor with e.g. ComBat.
> Shall I just go on with the limma analysis and keep in mind that some
> genes just might pop up due to batch effects?
>
> best,
> tefina
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
--
*A model is a lie that helps you see the truth.*
Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
[[alternative HTML version deleted]]

You might be able to salvage some of the biology, depending on the
microarray that you used. If there are enough control probes on the
array, you could use something like Surrogate Variable Analysis on the
control probe values. However, do look into the types of control probes
being used (they don't have to be just negative control probes). The
idea is that some portion of the differences across samples at these
probes could capture some of the batch effect (given that you wouldn't
expect the experimental condition to be correlated with the control
probe values). You could then adjust for the surrogate variables
generated from these control probes in your analysis of the regular
probes.

-Andrew
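
A minimal sketch of the control-probe idea, under strong simplifying
assumptions: the batch effect is treated as a single additive shift on
the log2 scale, shared by control and regular probes, and it is
estimated from the control probes alone. This is NOT Surrogate Variable
Analysis; it is a much cruder location adjustment, and all names and
numbers below are hypothetical toy values chosen for illustration.

```python
from statistics import mean


def estimate_batch_shift(control_values, batches):
    """Estimate an additive per-batch shift from control probes only.

    control_values: one list of log2 control-probe intensities per sample.
    batches: one batch label per sample.
    Returns {batch: shift}, where shift is the batch's mean control
    intensity minus the mean of all batch means.
    """
    per_batch = {}
    for values, batch in zip(control_values, batches):
        per_batch.setdefault(batch, []).extend(values)
    batch_means = {b: mean(v) for b, v in per_batch.items()}
    grand_mean = mean(list(batch_means.values()))
    return {b: m - grand_mean for b, m in batch_means.items()}


def adjust_samples(expression, batches, shifts):
    """Subtract each sample's batch shift from all of its regular probes."""
    return [[x - shifts[b] for x in sample]
            for sample, b in zip(expression, batches)]


# Hypothetical toy data: 4 samples, 3 control probes, 2 regular probes.
batches = ["b1", "b1", "b2", "b2"]
controls = [[5.0, 6.0, 7.0], [5.0, 6.0, 7.0],
            [6.0, 7.0, 8.0], [6.0, 7.0, 8.0]]
expression = [[8.0, 9.0], [8.0, 9.0], [10.0, 10.0], [10.0, 10.0]]

shifts = estimate_batch_shift(controls, batches)
adjusted = adjust_samples(expression, batches, shifts)
print(shifts)
print(adjusted)
```

Whatever difference remains between the batches after this adjustment
is then attributed to condition, which is only as trustworthy as the
assumption that the control probes do not respond to the condition.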

Dear Tefina,

Since condition and time batch are completely confounded, strictly
speaking there is no way to distinguish between the effect of the
condition and the effect of the batch. So no software package can help.

However, if you have negative controls, i.e. genes which should not be
affected by condition (including the manufacturer's control genes), you
can try to use them to estimate (and at least partly remove) the batch
effect.

Best regards,
Moshe.
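
To illustrate why no package can separate the two effects, here is a
small sketch that computes the rank of a design matrix with an
intercept, a condition indicator and a batch indicator. The sample
layouts are hypothetical: the "confounded" one mirrors the situation
described in the original post (each condition in its own batch), and
the "balanced" one is an invented contrast case. The rank routine is a
plain Gaussian elimination written for this example, not a function
from limma or any other package.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination on exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design(conditions, batches):
    """Columns: intercept, condition == 'B', batch == 'batch2'."""
    return [[1, int(c == "B"), int(b == "batch2")]
            for c, b in zip(conditions, batches)]


conditions = ["A", "A", "A", "B", "B", "B"]

# Hypothetical confounded layout: every A in batch1, every B in batch2.
confounded = design(conditions, ["batch1"] * 3 + ["batch2"] * 3)

# Hypothetical balanced layout: both conditions appear in both batches.
balanced = design(conditions,
                  ["batch1", "batch2", "batch1", "batch2", "batch1", "batch2"])

rank_confounded = matrix_rank(confounded)
rank_balanced = matrix_rank(balanced)
print(rank_confounded, rank_balanced)
```

In the confounded layout the condition and batch columns are identical,
so the three-column design has rank 2 and the two coefficients cannot
be estimated separately; in the balanced layout the rank is 3.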

We clearly need more information about your experiment to help you with
removing batch effects. Judging from your original email, it sounded
like all of condition 1 was run on batch 1 and all of condition 2 was
run on batch 2 (i.e. this is perfect confounding). What exactly was run
on batch 1 (experiment 1) and what was run on batch 2 (experiment 2)?

In your approach outlined below, you might still miss a lot, or have a
lot of false positives. But it's hard to know without your experimental
design.

Also, depending on your experiment, if you're working in a
well-understood system, like yeast or stem cells, biologists could
probably give you a list of genes that should change with a given
treatment. You could look at the variation across batches at these
genes to see if SVA or ComBat is removing real biology.

-Andrew
Message: 14
Date: Mon, 12 Mar 2012 10:09:54 +0000
From: tefina <tefina.paloma@gmail.com>
To: <bioconductor@stat.math.ethz.ch>
Subject: Re: [BioC] batch effect confounded with condition
Message-ID: <loom.20120312t104842-37@post.gmane.org>
Content-Type: text/plain; charset="us-ascii"

Thanks for all your answers.

I think I will pursue the following strategy:

I will analyse each experiment separately. Contrasts within each
experiment should be fine. (As batch 1 concerns only experiment 1 and
batch 2 concerns only experiment 2).

Any comparisons between experiments ( = comparisons between batches) I
will only do on the p-value level. So I will only evaluate things like:
do genes that show up in experiment 1 also show up in experiment 2?

Doing this, I should be more or less on the safe side.
What do you think?

I have 2 conditions (let's call them conditions A and B) with 3
biological replicates each in batch 1.
Additionally, I have 2 other conditions (conditions C and D) with 3
biological replicates in batch 2.

Questions of interest are:
- differentially expressed genes between A vs B, and C vs D (this can
  be done via limma)
- common regulation of genes comparing A vs B and C vs D

Obviously, the second question is the more difficult one. At the
moment, I would just compare the lists of differentially expressed
genes on the p-value level.
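
One possible way to make the list comparison concrete is sketched
below: pick the significant genes in each within-batch contrast, take
the intersection, and ask how likely an overlap at least that large
would be if the two lists were unrelated (a hypergeometric upper tail).
This is only an illustration under assumptions not stated in the
thread: the gene names and p-values are invented, the p-values are
assumed to be already adjusted for multiple testing, the 0.05 cutoff is
arbitrary, and genes are treated as independent.

```python
from math import comb


def significant(pvalues, alpha=0.05):
    """Genes whose (assumed already adjusted) p-value is below alpha."""
    return {gene for gene, p in pvalues.items() if p < alpha}


def overlap_pvalue(n_universe, n_list1, n_list2, n_overlap):
    """P(overlap >= n_overlap) for two unrelated lists (hypergeometric)."""
    total = comb(n_universe, n_list2)
    upper = min(n_list1, n_list2)
    tail = sum(comb(n_list1, k) * comb(n_universe - n_list1, n_list2 - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


# Hypothetical toy results for 10 genes measured in both experiments.
genes = [f"g{i}" for i in range(1, 11)]
p_a_vs_b = {g: (0.01 if g in {"g1", "g2", "g3"} else 0.5) for g in genes}
p_c_vs_d = {g: (0.01 if g in {"g2", "g3", "g4"} else 0.5) for g in genes}

hits_ab = significant(p_a_vs_b)
hits_cd = significant(p_c_vs_d)
shared = hits_ab & hits_cd
p_overlap = overlap_pvalue(len(genes), len(hits_ab), len(hits_cd), len(shared))
print(sorted(shared), p_overlap)
```

Note that the size of the overlap depends on the cutoff and on the
power of each experiment, so a gene missing from one list is weak
evidence that it is not regulated there.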