Hi Julia,
Please don't take conversations off list (e.g., use Reply-All to
respond).
On Wed, Sep 3, 2014 at 2:18 AM, Pickl, Julia <j.pickl at dkfz-heidelberg.de>
wrote:
> Hi Jim,
>
> could you please tell me why contrasts 3 and 4 are not valid contrasts? I
> do not understand it completely. Is it because the same number of factors
> should have +1 and -1 in the contrast matrix?
>
They aren't valid contrasts because the coefficients don't add up to zero.
You use the contrast to form a t-statistic, which you then use to test a
null hypothesis versus an alternative hypothesis.

In general, the null hypothesis is that the numerator of the t-statistic is
equal to zero, and the alternative hypothesis is that the numerator is not
equal to zero (depending on the alternative you can also test that the
numerator is greater or less than zero). Because of this, the coefficients
of the contrast have to add up to zero (or else you aren't testing the null
that the numerator equals zero).
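
As a rough illustration of that check (plain Python arithmetic, not edgeR
code), the sketch below sums the coefficients of each contrast as they
appear in the table quoted further down in this thread. The names contrasts
and sums_to_zero are made up for the example.

```python
# Coefficients copied from the table quoted later in this thread.
# Row order: IP.treat, IP.control, IgG.treat, IgG.control.
contrasts = {
    "IP":        [1, -1,  0,  0],
    "contrast1": [1, -1, -1,  1],
    "contrast2": [1, -1,  0,  0],
    "contrast3": [1, -1, -1, -1],
    "contrast4": [1, -1, -1,  0],
    "contrast5": [0,  0,  1, -1],
}


def sums_to_zero(coefs):
    """Return True when the contrast coefficients add up to zero."""
    return sum(coefs) == 0


for name, coefs in contrasts.items():
    status = "ok" if sums_to_zero(coefs) else "coefficients do not sum to zero"
    print(name, sum(coefs), status)
```

Only contrast3 and contrast4 fail the check, with sums of -2 and -1.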
So if we look at your contrast 3, you have

IP.treat - IP.control - IgG.treat

Now remember, ANOVA is simply algebra. You could be hypothesizing that
IP.treat - IP.control - IgG.treat = 0, which would imply that IP.control
and IgG.treat somehow sum to be equal to IP.treat. But that is a weird sort
of null hypothesis (and from a biological perspective, why would you think
that would be true?). One would usually assume that under the null, there
are no differences between any of those three groups. In which case this
contrast would be testing that IP.treat - IP.control - IgG.treat equals -1
times the common group value, which is certainly a valid thing to test, I
suppose, but what would it mean to reject that null hypothesis? There are
any number of ways that those three coefficients could add up to something
different, so it isn't clear what you are testing here.
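
A small sketch of why this is hard to interpret, using invented numbers:
for two hypothetical genes with no differences at all between the three
groups, the value of this contrast is not zero but simply tracks the common
level, because the coefficients sum to -1. The group values and the helper
contrast_value are assumptions made up for the illustration.

```python
def contrast_value(means, coefs):
    """Linear combination of group means with the given coefficients."""
    return sum(coefs[group] * means[group] for group in coefs)


# IP.treat - IP.control - IgG.treat, as written above.
coefs = {"IP.treat": 1, "IP.control": -1, "IgG.treat": -1}

# Hypothetical genes: no group differences, but different overall levels.
low = {"IP.treat": 2.0, "IP.control": 2.0, "IgG.treat": 2.0}
high = {"IP.treat": 8.0, "IP.control": 8.0, "IgG.treat": 8.0}

print(contrast_value(low, coefs))   # -2.0
print(contrast_value(high, coefs))  # -8.0
```

With a contrast whose coefficients sum to zero, both genes would give
exactly zero.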
> From a biological point of view I still have problems with contrast 1
>
> (IP.treat-IgG.treat)-(IP.control-IgG.control),
>
> as it is also
>
> IP.treat - IgG.treat - IP.control *+* IgG.control
>
> And this looks like the counts of IP.treat *plus* IgG.control are
> compared to IgG.treat and IP.control.
>
And that is another interpretation for that contrast. This is why the
commutative and associative laws are useful; you can move things around in
such a way to make interpretation of the result easier (or harder, if you
so desire).

There are two things to consider. First, you want to set up both your
coefficients and any contrasts in such a way that you can most easily
interpret the results. In this case, setting up the contrast as (and
thinking of the contrast in terms of)
(IP.treat - IP.control) - (IgG.treat - IgG.control) is easiest.

This is because you can then formulate the null hypothesis as

firstpart - secondpart = 0

or alternatively

firstpart = secondpart

which means that the difference between IP.treat and IP.control is equal to
the difference between IgG.treat and IgG.control, which you can then
interpret as meaning that the IP results are indistinguishable from the IgG
results. And since that is a useful null hypothesis, given the experiment,
it is best to interpret the contrast that way.
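
To make the algebra concrete, here is a minimal sketch (again plain Python,
with a made-up helper difference_of_differences) showing that the two ways
of grouping the terms give exactly the same coefficients, so they are the
same contrast and only the reading differs.

```python
def difference_of_differences(a, b, c, d):
    """Coefficients of (a - b) - (c - d), keyed by group name."""
    return {a: 1, b: -1, c: -1, d: 1}


# (IP.treat - IgG.treat) - (IP.control - IgG.control)
by_antibody = difference_of_differences(
    "IP.treat", "IgG.treat", "IP.control", "IgG.control")

# (IP.treat - IP.control) - (IgG.treat - IgG.control)
by_treatment = difference_of_differences(
    "IP.treat", "IP.control", "IgG.treat", "IgG.control")

print(by_antibody == by_treatment)  # True
print(sum(by_treatment.values()))   # 0
```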

The second issue has to do with rejecting the null hypothesis, and what
that means. For a simple contrast, interpreting a rejected null hypothesis
is simple. Say you tested

IP.treat - IP.control

and you reject the null with a p < 0.05, and the t-statistic has a value of
13.4. It's easy then to say that there appears to be a difference between
those two samples, and it is also easy to see that the treatment results in
way more of the given gene being pulled down by the IP step (because the
t-statistic has a positive sign, implying a positive fold change, which can
only come about if the IP.treat coefficient is larger than the IP.control
coefficient).

But if you get a p < 0.05 and a t-statistic of 13.4 for the interaction
term (IP.treat - IP.control - IgG.treat + IgG.control), then how do you
interpret that result? With just the t-statistic (or even the log fold
change) all you can say is that there is a difference between treatment and
control that is dependent on whether or not you used the IP antibody or
non-specific IgG. But this result can arise in any number of ways, and you
need to explore the data further to see exactly what is going on, by e.g.
plotting the logCPM values by group.
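
As an illustration with invented numbers, the two hypothetical genes below
have the same interaction value but very different group patterns: in one
the IP signal goes up with treatment, in the other the IgG background goes
down. The logCPM values and the helper interaction are assumptions made up
for this sketch.

```python
def interaction(m):
    """(IP.treat - IP.control) - (IgG.treat - IgG.control)."""
    return (m["IP.treat"] - m["IP.control"]) - (m["IgG.treat"] - m["IgG.control"])


# Hypothetical group-level logCPM values for two genes.
gene_a = {"IP.treat": 9.0, "IP.control": 6.0, "IgG.treat": 4.0, "IgG.control": 4.0}
gene_b = {"IP.treat": 6.0, "IP.control": 6.0, "IgG.treat": 2.0, "IgG.control": 5.0}

print(interaction(gene_a))  # 3.0, driven by a change in the IP samples
print(interaction(gene_b))  # 3.0, driven by a change in the IgG samples
```

The interaction value alone cannot tell these two cases apart, which is why
looking at the per-group values matters.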
Best,
Jim
>
> Thank you for your help!
>
> Best wishes,
>
> Julia
>
> *From:* James W. MacDonald [mailto:jmacdon at uw.edu]
> *Sent:* Tuesday, 2 September 2014 16:35
> *To:* Julia [guest]
> *Cc:* bioconductor at r-project.org; Pickl, Julia
> *Subject:* Re: [BioC] edgeR contrast
>
> Hi Julia,
>
> This appears to be a ChIP-Seq experiment, in which case I wouldn't analyze
> it this way. Instead, I would use something like MACS to call peaks, using
> the IgG fractions as the 'input' fraction. In other words, the IgG fraction
> is used to help distinguish real IP regions from those regions that have
> high sequencing depth due to technical factors. You would then use edgeR to
> compare IP.treat versus IP.control. This is not a trivial analysis, and you
> should look at this paper
> (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4066778/) for more
> information on how you should normalize your counts. But maybe I completely
> misunderstand the experiment.
>
> In that case, back to your question. Contrasts 3 and 4 aren't valid
> contrasts, so you should just ignore those results. Contrast 1 is an
> interaction contrast, and is testing for genes that have different amounts
> of IP binding between treatment and control, after adjusting for
> non-specific binding. If this isn't ChIP-Seq, but instead is some
> transcript binding experiment, then this is likely the contrast you want.
>
> Best,
>
> Jim
>
> On Tue, Sep 2, 2014 at 6:42 AM, Julia [guest] <guest at bioconductor.org>
> wrote:
>
> I try different contrasts with edgeR to get a feeling for my data and also
> to find out the best contrast for my question. I would like to know which
> genes are enriched in IP.treat compared to IP.control, both adjusted for
> unspecific IgG binding.
> So it seems like contrast 1 is the best:
> (IP.treat-IgG.treat)-(IP.control-IgG.control), however it seems like
> IgG.control is added to IP.treat, as - and - is +. I then tried contrasts
> 3 and 4, and get totally different results, with genes only FC>1.
> My question: Is it allowed to have more levels with -1 than with +1, or
> how can it be explained that contrasts 3 and 4 look very similar but
> totally different from contrast 1 (and IP)?
>
> Levels        IP  contrast1  contrast2  contrast3  contrast4  contrast5
> IP.treat       1          1          1          1          1          0
> IP.control    -1         -1         -1         -1         -1          0
> IgG.treat      0         -1          0         -1         -1          1
> IgG.control    0          1          0         -1          0         -1
>
> Thanks for any help.
>
> Julia
>
> -- output of sessionInfo():
>
> .
>
> --
>
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
I'm a bit late to the party, but if this is a ChIP-seq analysis, you might
consider giving the csaw package a try. This performs a de novo
differential binding analysis by counting reads into sliding windows and
analyzing them with edgeR. For sharp binding, this may be more appropriate
than counting over genes, which is what you seem to be doing.
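
To give a feel for what counting reads into sliding windows means, here is
a toy sketch in plain Python. It is not csaw's implementation or API; the
function window_counts, the window width and spacing, and the read
positions are all assumptions made up for the illustration.

```python
def window_counts(read_positions, chrom_length, width=150, spacing=50):
    """Count reads whose position falls in each window [start, start + width)."""
    counts = []
    for start in range(0, chrom_length - width + 1, spacing):
        end = start + width
        n = sum(1 for p in read_positions if start <= p < end)
        counts.append((start, end, n))
    return counts


# Hypothetical read positions on a 500 bp stretch.
reads = [10, 40, 120, 130, 135, 400]

for start, end, n in window_counts(reads, chrom_length=500):
    print(start, end, n)
```

Each window then plays the role that a gene plays in a gene-level count
table, and overlapping windows mean a single read can be counted more than
once.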