Question

DEseq2 design for knockdown conditions and treatment

0

Entering edit mode

j.siefert • 0

@jsiefert-21921

Last seen 5.6 years ago

I have a question about the proper design for DEseq2, when there is an siControl as well as siRNA for a target gene, then there is a treated and a vehicle for both conditions.

I setup the factors and design like so:

> dds$condition <- factor(dds$condition, levels =c("siControl","siTarget"))
> dds$condition <- relevel(dds$condition, ref= "siControl") 
> dds$treatment <- factor(dds$treatment, levels = c("vehicle","treated")) 
> dds$treatment <- relevel(dds$treatment, ref ="vehicle”) 
> design(dds) <- ~ condition + treatment + condition:treatment

At first it appeared to be giving me what I thought I wanted, differences in the siTarget population upon treatment, controlling for the siControl and Vehicle condition. But then I realized that the target gene is not on the list of DE genes, even though there is a clear knockdown. I think this is because the target gene is also down in the Vehicle condition.

I can compare just siTarget vs siControl, but then this list is dominated by genes that are different in the Vehicle condition. I can also compare treated vs. vehicle, but then there is a combination of siControl and siTarget genes in the list.

Ideally what I would like to do is normalize the treated condition to the untreated condition, and then look for differences between siControl and siTarget.

Is this possible with DESeq? Is there something wrong with this approach? Is there a better way to do this analysis?

Any help or advice would be very much appreciated.

deseq2 • 1.7k views

ADD COMMENT • link updated 5.6 years ago by swbarnes2 ★ 1.4k • written 5.6 years ago by j.siefert • 0

score 0 · Answer 1 · 2019-09-17

0

Entering edit mode

Michael Love 43k

@mikelove

Last seen 1 day ago

United States

One comment: you don't need to use relevel above, if you specify the levels when you call factor. The second and fourth line aren't doing anything additional, because you already specified siControl and vehicle to be reference levels.

Can you describe what pattern you are looking for in the results? I can't tell whether you want to control for the treated vs vehicle differences or not. If you want to control for this difference, then the above design is correct, and your assay for the target gene is not working because you are not seeing a difference that is significantly different than what you see in treated vs vehicle. Am I missing something? Have you looked at plotCounts for your target?

ADD COMMENT • link 5.6 years ago Michael Love 43k

0

Entering edit mode

Thanks for the quick reply and the comment. I didn't realize that the order of the factors specified the reference.

What I would like to find is genes that are differentially expressed in the siTarget treated vs. the siControl treated. I know that there will be a lot of differentially expressed genes between the vehicle and treated samples, in both the siControl and the siTarget. I'm interested in how knockdown of the target gene influences the expression of these genes. However, I realized that the knockdown might effect expression of these genes even in the vehicle condition. So by controlling for the differences between the siControl and the siTarget vehicle condition, I am losing many of the differentially expressed genes influenced by the the target gene. But if I don't control for this, then the list is dominated by genes that are different in the vehicle condition and its harder to see those that are differentially expressed in the treated condition.

Essentially I would like the vehicle condition to be the baseline for both the siControl and the siTarget, and then find differentially expressed genes between the treated condition. That's why I was thinking that if the treated samples could be normalized to the vehicle samples, then the differences between the siControl and the siTarget would become more apparent.

I did look at plotCounts and the target gene is depleted in both the vehicle and the treated conditions, which is why it's not significantly different. The problem is, I have the same effect for the genes under the influence of my target gene. I couldn't figure out how to put an image on here, but here are the counts:

> sample    count   condition   treatment   replicate
> wz1024    758.7739    siControl   vehicle 1
> wz1025    1276.4789   siControl   treated 1
> wz1026    342.5122    siTarget    vehicle 1 
> wz1027    551.4876    siTarget    treated 1
> wz1032    661.482 siControl   vehicle 2
> wz1033    1417.0398   siControl   treated 2
> wz1034    363.0176    siTarget    vehicle 2 
> wz1035    459.1506    siTarget    treated 2

You can see from this that upon treatment the gene levels go up, in both the siControl and the siTarget. The gene levels are reduced in the siTarget, in both the vehicle and the treated. What I am mainly interested in quantifying is how much gene expression is changed in the siTarget treated vs the siControl treated.

ADD REPLY • link 5.6 years ago j.siefert • 0

0

Entering edit mode

I would like the vehicle condition to be the baseline for both the siControl and the siTarget, and then find differentially expressed genes between the treated condition

I'm still lost. Can you express your quantity of interest as a ratio of these terms: {siC-V, siC-T, siT-V, siT-T} ?

ADD REPLY • link 5.6 years ago Michael Love 43k

0

Entering edit mode

I think I'm trying to do this:

{ siC-T/siC-V : siT-T/siT-V }

ADD REPLY • link 5.6 years ago j.siefert • 0

0

Entering edit mode

That's the design you are using. The null is that this equals 1.

ADD REPLY • link 5.6 years ago Michael Love 43k

0

Entering edit mode

That's what I was afraid of. Ok, thanks Michael

ADD REPLY • link 5.6 years ago j.siefert • 0

0

Entering edit mode

Sorry, one more quick question. If I wanted to look at { siC-T/siC-V : siT-T/siC-V } , and ignore the differences in the siT-V sample, what would be the design for that?

ADD REPLY • link 5.6 years ago j.siefert • 0

0

Entering edit mode

(siC-T / siC-V) / (siT-T / siC-V) = siC-T / siT-T

This comes up every now and then, but there isn't really any meaning to comparing A and B relative to C. A/C / B/C is just A/B.

ADD REPLY • link 5.6 years ago Michael Love 43k

0

Entering edit mode

Ah yes, of course. Ok, thanks

ADD REPLY • link 5.6 years ago j.siefert • 0

score 0 · Answer 2 · 2019-09-17

You really need to also state how you called results, for anyone to know what results you are extracting.

You seem to be asking for two different things. If you really want genes where (control-treated/control-vehicle)/ (target-control/target-vehicle) is far from 1, then the interaction term is the way to go.

But if you want

What I would like to find is genes that are differentially expressed in the siTarget treated vs. the siControl treated.

What's simpler is to make a new column that is condition.treatment, and compare target.treated to control.treated. You do not need to make a subsetted dds of just these samples to do this. (This can be done with the interaction term in the design and some cleverness in the results call, but the condition.treatment way is much easier to follow)

If you want to compare all the targets to all the controls, while having the software understand that vehicle vs treated matters, and explains some of the variance, and should be modeled as such, you'd do ~ treatment + condition

That's why I was thinking that if the treated samples could be normalized to the vehicle samples,

This is not how DESeq works. You don't 'normalize' like that. What you can do is add treatment to the design, so it will be incorporated into the model.