Question

What is the correct contrast matrix if one treatment must be compared to both of the other two treatments, but separately?

1

Peter • 0

@peter-11031

Last seen 5.2 years ago

United Kingdom

I have RNA-seq data from a single-factor experiment with three levels: non-treated control, placebo-treated negative control, and treated cells. I have 3 replicates for each.

After running edgeR, and also looking at expression levels, I noticed that the negative control is not a good control, as it has DE genes compared to non-treated, while the treatment is not DE for these genes (has the same expression levels).

So, to be safe, I want to find genes that are DE in the treatment compared to both the non-treated control and the negative control.

How should the contrast matrix be designed?

I suspect that the solution is simply to take the intersection of separately calculated DE gene lists (instead of building it into the contrast).

My design matrix:
groups <- factor(c(0,0,0,1,1,1,2,2,2))
design <- model.matrix(~ 0 + groups)
colnames(design) <- c("O", "N", "T")

edger limma model.matrix contrast matrix design matrix • 1.3k views

ADD COMMENT • link updated 7.8 years ago by Gavin Kelly ▴ 680 • written 7.8 years ago by Peter • 0

1

Gavin Kelly ▴ 680

@gavin-kelly-6944

Last seen 4.0 years ago

United Kingdom / London / Francis Crick…

I agree with Ryan's support of the 'intersection' approach you suggested for cases where you want your effect to be distinguishable from both types of control, but you'll be exposed to two rounds of statistical error, with a consequent loss of power. I think it might be worth looking at other approaches too, depending on the definition of 'placebo' here. If the placebo induces some partially confounding effect (a scrambled siRNA vector,...), then it is the true control: your three conditions are O = baseline, N = baseline + placebo_effect, T = baseline + placebo_effect + biological_effect, so the N vs T comparison gives you the straight biological effect, and you can effectively ignore the untreated group (apart from its contribution to the estimation of within-group noise). In this situation, eliminating genes that don't have a significant O vs T effect could decrease your power and potentially introduce bias against genes where the biological effect works to counteract the placebo effect (only the experimenter will know whether a gene that is upregulated in response to placebo and reverts to baseline on full treatment is interesting or not).
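To make the additive picture above concrete, here is a small sketch in base R. The numeric values for the baseline, placebo effect and biological effect are made up purely for illustration, and the contrast vectors assume the column order O, N, T from the design in the question.

```r
# Hypothetical log-scale group means under the additive model described above
baseline <- 5
placebo_effect <- 1.5
biological_effect <- 2
means <- c(O = baseline,
           N = baseline + placebo_effect,
           T = baseline + placebo_effect + biological_effect)

# Contrast vectors over the design columns (O, N, T)
T_vs_N <- c(O = 0, N = -1, T = 1)
T_vs_O <- c(O = -1, N = 0, T = 1)

# T vs N isolates the biological effect; T vs O also carries the placebo effect
effect_T_vs_N <- sum(T_vs_N * means)
effect_T_vs_O <- sum(T_vs_O * means)
```

With these made-up numbers, effect_T_vs_N equals biological_effect alone, while effect_T_vs_O is the sum of the placebo and biological effects, which is why a gene whose treatment effect cancels the placebo effect would show nothing in T vs O.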

At the other end of the spectrum, you could actually pool your two types of control, groups <- factor(c(0,0,0,0,0,0,2,2,2)), so that any placebo effect is included as part of the 'replicate' variability; anything that survives this increase in variability is a biological effect large enough to dwarf any placebo-induced variability.
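A minimal sketch of what the pooled design could look like, using only base R. The column label "C" (pooled controls) and the variable names are my own choices for illustration, not something from the original post.

```r
# Pool non-treated and placebo samples into a single control level
groups_pooled <- factor(c(0,0,0,0,0,0,2,2,2))
design_pooled <- model.matrix(~ 0 + groups_pooled)
colnames(design_pooled) <- c("C", "T")  # "C" is a hypothetical label for the pooled controls

# Treated vs pooled controls, over the columns (C, T)
contrast_pooled <- c(C = -1, T = 1)

# Samples per column: 6 pooled controls, 3 treated
samples_per_group <- colSums(design_pooled)
```

The design now has two columns instead of three, so any systematic difference between O and N is absorbed into the dispersion estimate for the control group.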

When I get vaguely-specified controls, I tend to apply all three approaches, and draw the expression profiles of genes that are significant in some but not all of the approaches - the experimenter then has a clear visualisation of the different hypotheses being tested.

ADD COMMENT • link 7.8 years ago Gavin Kelly ▴ 680

0

Thank you for the comment, it's true that there are two rounds of error, although I think a more profound loss arises from not including the genes that are not DE against both controls.

Also, yes, there is the danger that the intersection approach ignores effects countering the placebo effect. However, the genes in question have basically the same expression level in O and T, which makes me doubt that this applies here.

As for pooling, I'll probably try it, but the increased variability means I would lose genes that are DE in T vs O or N but also have a big difference in N vs O. (Samples within each of the 3 groups have very similar expression.) I would also lose those that have opposite fold changes, although it is a good question whether those should be included in the original intersection approach anyway.

ADD REPLY • link 7.8 years ago Peter • 0

score 2 · Accepted Answer · 2016-07-16

2

Ryan C. Thompson ★ 7.9k

@ryan-c-thompson-5618

Last seen 8 months ago

Scripps Research, La Jolla, CA

You are correct. The solution is to perform the individual contrasts separately and then take the intersection.
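For illustration, one way this could be set up with edgeR's quasi-likelihood pipeline is sketched below. The counts are simulated stand-in data, the FDR cutoff of 0.05 is an arbitrary choice, and requiring the same direction of change in both contrasts is an extra assumption that you may or may not want (see the comment above about opposite fold changes).

```r
library(edgeR)  # also attaches limma, which provides makeContrasts()

set.seed(1)
# Simulated stand-in data: 1000 genes x 9 samples (not real data)
counts <- matrix(rnbinom(1000 * 9, mu = 100, size = 10), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:9)))
# Make the first 50 genes higher in the treated samples only
counts[1:50, 7:9] <- counts[1:50, 7:9] * 8

# Design from the question: O = non-treated, N = placebo, T = treated
groups <- factor(c(0,0,0,1,1,1,2,2,2))
design <- model.matrix(~ 0 + groups)
colnames(design) <- c("O", "N", "T")

y <- DGEList(counts = counts, group = groups)
y <- calcNormFactors(y)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# Two separate contrasts, each tested on its own
contr <- makeContrasts(TvsO = T - O, TvsN = T - N, levels = design)
tt_O <- topTags(glmQLFTest(fit, contrast = contr[, "TvsO"]),
                n = Inf, sort.by = "none")$table
tt_N <- topTags(glmQLFTest(fit, contrast = contr[, "TvsN"]),
                n = Inf, sort.by = "none")$table

# Intersection: significant in both contrasts, same direction (assumed cutoff 0.05)
keep <- tt_O$FDR < 0.05 & tt_N$FDR < 0.05 &
        sign(tt_O$logFC) == sign(tt_N$logFC)
de_both <- rownames(tt_O)[keep]
```

Because sort.by = "none" keeps both tables in the original gene order, the two FDR columns can be compared row by row without matching on gene names.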

ADD COMMENT • link 7.8 years ago Ryan C. Thompson ★ 7.9k