Question

deseq2 analysis with multiple factors and interaction terms

hsbio ▴ 10

@hsbio-14446

Hi,

I am trying to do some differential expression analysis with DESeq2 at the moment, and my samples vary across 3 factors: genotype (A and B), treatment (control and treated), and time (30 min and 1 hr). Each combination of conditions has at least 3 replicates. I have read through previous threads and noticed that one suggestion is to combine all of these factors into a single grouping variable, which I have done, and it provided some useful comparisons. However, I would also like to try an analysis using interaction terms, and I have a few questions about this.

One of the main aspects I want to look at is whether the treatment effect is different over genotype as well as generally looking at genotype differences. I've had a look at ?results and I think I understand how this is done with 2 factors, but I wasn't sure what my design should look like with 3.

Would a design of ~ genotype + treatment + time + genotype:treatment be reasonable here?

Given this design, would passing these arguments to results() do the following?

contrast=c("genotype","B","A") -- give me the differences due to genotype taking into account any differences due to treatment/time?

contrast=c("treatment","treated","control") -- give me the differences due to treatment overall or only at genotype A?

contrast=list(c("treatment_treated_vs_control","genotypeB.treatmenttreated")) -- give me just the differences in the treatment effect between genotype B and genotype A? Or just the total treatment effect for genotype B?

name="genotypeB.treatmenttreated" -- give me the difference in the treatment effect between genotype B and genotype A?
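A minimal base-R sketch, not part of the original thread, of how the coefficients in this design combine. The sample table `coldata` and the helper `cell()` are hypothetical, only `model.matrix()` is used, and DESeq2 itself is not run. DESeq2 reports the same coefficients under slightly different names (for example treatment_treated_vs_control and genotypeB.treatmenttreated, as in the arguments above).

```r
# Hypothetical sample table: 2 genotypes x 2 treatments x 2 time points, 3 replicates each
coldata <- expand.grid(genotype = c("A", "B"),
                       treatment = c("control", "treated"),
                       time = c("30min", "1hr"),
                       replicate = 1:3,
                       stringsAsFactors = FALSE)
coldata$genotype  <- factor(coldata$genotype,  levels = c("A", "B"))
coldata$treatment <- factor(coldata$treatment, levels = c("control", "treated"))
coldata$time      <- factor(coldata$time,      levels = c("30min", "1hr"))

# The design proposed in the question
X <- model.matrix(~ genotype + treatment + time + genotype:treatment, data = coldata)
print(colnames(X))

# Design-matrix row shared by all samples in one genotype/treatment/time cell
cell <- function(g, trt, tm) {
  i <- which(coldata$genotype == g & coldata$treatment == trt & coldata$time == tm)[1]
  X[i, ]
}

# Which coefficients separate two cells? (time is held fixed at 30 min)
geno_in_control <- cell("B", "control", "30min") - cell("A", "control", "30min")
trt_in_A        <- cell("A", "treated", "30min") - cell("A", "control", "30min")
trt_in_B        <- cell("B", "treated", "30min") - cell("B", "control", "30min")
trt_B_minus_A   <- trt_in_B - trt_in_A

nonzero <- function(v) names(v)[v != 0]
print(nonzero(geno_in_control))  # coefficients behind B vs A in control samples
print(nonzero(trt_in_A))         # coefficients behind the treatment effect in A
print(nonzero(trt_in_B))         # coefficients behind the treatment effect in B
print(nonzero(trt_B_minus_A))    # coefficients behind the difference of the two effects
```

Under these assumptions, the treatment difference in genotype A involves only the treatment main effect, the one in genotype B adds the interaction term, and the difference between the two is the interaction term alone. The genotype main effect is the B vs A difference in control samples only. Because time enters this design without an interaction, the same coefficients are obtained at either time point.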

Sorry, I know a lot of this is covered in the vignette/?results but I just wanted to make sure with 3 factors. I was also wondering how the time aspect would play into these comparisons, and whether I should add another interaction term to my design (treatment:time) since, while the control treatment shouldn't change over time, the treated samples should differ between the 2 time points.

Thanks!

deseq2 multiple factor design differential analysis differential gene expression • 5.8k views

updated 8.3 years ago by Michael Love 43k • written 8.3 years ago by hsbio ▴ 10

score 0 · Answer 1 · 2017-11-22

Michael Love 43k

@mikelove

United States

Re: "whether the treatment effect is different over genotype" I like to start things off by asking how you want time to be included. One possibility is to consider a 30 min treatment effect and a 1 hour treatment effect and look at how these two effects are different over genotype. I'm assuming you have a full design, so 2 x 2 x 2 = 8 combinations of the 3 factors with 3 replicates each = 24 samples.

written 8.3 years ago by Michael Love 43k

Hi Michael,

Thanks for the response. I think that would be one way I would want to look at it. Would it also be possible to look at whether the differences over time, say 1hr vs 30 min are different over genotypes? Perhaps this would over-complicate it a bit.

And yes, some of the combinations have more than 3 replicates, but each of the 8 combinations has at least 3 replicates.

Thanks for the help!

written 8.3 years ago by hsbio ▴ 10

Hi,

If you code treatment so that it has three levels, "control", "time30m", and "time60m", you can use the design ~ genotype + time + genotype:treatment.

This will provide four interaction terms for genotype x the two treatment effects, and you can contrast these using the 'list' style of contrast in the results() function.
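A small base-R sketch of this recoding, under stated assumptions: the sample table, the `status` column, and the level names `treated30min` and `treated1hr` are hypothetical stand-ins for the three-level coding described above, and only `model.matrix()` is used rather than DESeq2.

```r
# Hypothetical sample table: 2 genotypes x 2 time points x control/treated, 3 replicates each
coldata <- expand.grid(genotype = c("A", "B"),
                       time = c("30min", "1hr"),
                       status = c("control", "treated"),
                       replicate = 1:3,
                       stringsAsFactors = FALSE)

# Three-level treatment: controls stay "control", treated samples are labelled by time point
coldata$treatment <- ifelse(coldata$status == "control",
                            "control",
                            paste0("treated", coldata$time))

coldata$genotype  <- factor(coldata$genotype,  levels = c("A", "B"))
coldata$time      <- factor(coldata$time,      levels = c("30min", "1hr"))
coldata$treatment <- factor(coldata$treatment,
                            levels = c("control", "treated30min", "treated1hr"))

X <- model.matrix(~ genotype + time + genotype:treatment, data = coldata)

# Interaction columns: one per genotype for each of the two treated levels
interaction_cols <- grep(":", colnames(X), value = TRUE)
print(interaction_cols)

# The design is usable only if no column is a linear combination of the others
full_rank <- qr(X)$rank == ncol(X)
print(full_rank)
```

Because the design has no treatment main effect, R codes the interaction with one column per genotype, which is where the four genotype-specific treatment terms come from. DESeq2 would report them with a dot instead of a colon, e.g. genotypeA.treatmenttreated1hr.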

"whether the differences over time, say 1hr vs 30 min are different over genotypes"

You'd need to have a more specific question than this, because this doesn't mention treatment at all.

written 8.3 years ago by Michael Love 43k

Hi,

Thanks for that suggestion.

I've done that and would just like to confirm a few things. In this design:

results(dds, name="genotypeA.treatmenttreated1hr") would essentially be the treatment effect at 1 hr in genotype A, i.e. (treated at 1 hr vs control treatment in samples that are genotype A)?

results(dds,contrast=list("genotypeA.treatmenttreated1hr","genotypeB.treatmenttreated1hr")) would be comparing the treatment effect (1hr) between genotype A and genotype B or essentially (treated 1hr vs control treatment in genotype A) vs (treated 1 hr vs control treatment in genotype B). Is that correct?

Also, is it reasonable to use this design for the purposes of getting this information i.e. whether the treatment effect is different over genotype, and to use a separate design where all the factors are merged into a single ~group variable when wanting to do comparisons between specific groups (e.g. genotypeAtreated1hr vs genotypeAtreated30, or genotypeAcontrol vs genotypeBcontrol)? Or is there a way to do both in one model, for consistency?

Thanks again, your help is much appreciated.

written 8.3 years ago by hsbio ▴ 10

Yes, that is the treatment effect at 1 hour in genotype A. Note, it is comparing to control samples that are genotype A at 1 hour.

Yes, that is the A vs B contrast for the treatment effect at 1 hour.

"Also, is it reasonable to use this design for the purposes of getting this information i.e. whether the treatment effect is different over genotype..."

I'm confused, because I feel like the above contrast already answers this question for each time point, and while taking into account the appropriate control samples, which is key.

If you recognize all the coefficients in the model, you can build up any comparison between two groups of samples by adding together coefficients with results() and 'contrast'. So there isn't the need to run a new model with ~group. You might want to sit down with a statistician who can explain what the terms in the linear model are doing. Or you can use ~group and do pairwise comparisons this way.
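One way to see which coefficients to add together is to subtract the design-matrix rows of the two groups being compared. The sketch below does this in base R for the three-level treatment coding; the sample table, the `status` column, the level names, and the helper `cell()` are hypothetical, and DESeq2 is not run. results() also accepts a numeric contrast with one entry per coefficient in resultsNames(dds), so vectors like these could in principle be passed to it, provided the coefficient order matches.

```r
# Hypothetical sample table: 2 genotypes x 2 time points x control/treated, 3 replicates each
coldata <- expand.grid(genotype = c("A", "B"),
                       time = c("30min", "1hr"),
                       status = c("control", "treated"),
                       replicate = 1:3,
                       stringsAsFactors = FALSE)
coldata$treatment <- ifelse(coldata$status == "control",
                            "control",
                            paste0("treated", coldata$time))
coldata$genotype  <- factor(coldata$genotype,  levels = c("A", "B"))
coldata$time      <- factor(coldata$time,      levels = c("30min", "1hr"))
coldata$treatment <- factor(coldata$treatment,
                            levels = c("control", "treated30min", "treated1hr"))

X <- model.matrix(~ genotype + time + genotype:treatment, data = coldata)

# Design-matrix row shared by all samples in one genotype/time/status cell
cell <- function(g, tm, st) {
  i <- which(coldata$genotype == g & coldata$time == tm & coldata$status == st)[1]
  X[i, ]
}

# Treatment effect at 1 hr in genotype A, relative to genotype A controls at 1 hr
trtA_1hr <- cell("A", "1hr", "treated") - cell("A", "1hr", "control")
# Same effect in genotype B
trtB_1hr <- cell("B", "1hr", "treated") - cell("B", "1hr", "control")
# A vs B difference in the 1 hr treatment effect
a_vs_b_1hr <- trtA_1hr - trtB_1hr
# Pairwise group comparison: genotype A treated 1 hr vs genotype A treated 30 min
pairwise <- cell("A", "1hr", "treated") - cell("A", "30min", "treated")

# Show only the coefficients each comparison uses, with their weights
nonzero <- function(v) v[v != 0]
print(nonzero(trtA_1hr))
print(nonzero(a_vs_b_1hr))
print(nonzero(pairwise))
```

In this sketch the pairwise comparison needs three coefficients (the time effect plus one interaction term minus another), which is the kind of sum the 'contrast' argument is meant to express without refitting with ~group.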

written 8.2 years ago by Michael Love 43k

Hi Michael, I am doing a similar analysis. My question is: I want to look at the differentially expressed genes for treated/untreated samples between genotypes A and B. Should I compute DEGs separately (DEGs for genotype A: treated/untreated, then DEGs for genotype B: treated/untreated) and at the end look at the difference between these two lists, or use a two-factor analysis and look at the "interaction", which compares A/B and treated/untreated together?

written 7.0 years ago by lihongfei93 • 0

You want an interaction test. Take a look at the vignette section on interactions and the examples in ?results.

written 7.0 years ago by Michael Love 43k