Question: DESeq2 analysis with multiple factors and interaction terms
hsbio0 wrote (5 months ago):

Hi,

I am trying to do some differential expression analysis with DESeq2 at the moment, and my samples have 3 different factors: genotype (A and B), treatment (control and treated), and time (1 hr vs 30 min). Each set of conditions has at least 3 replicates. I have read through previous threads and noticed that one of the suggestions is to group all these factors together, which I have done, and it provided some useful comparisons. However, I would also like to try an analysis using interaction terms, and I have a few questions about this.

One of the main aspects I want to look at is whether the treatment effect is different over genotype, as well as generally looking at genotype differences. I've had a look at ?results and I think I understand how this is done with 2 different groups, but I wasn't sure how my design should look with 3.

Would a design of ~ genotype + treatment + time + genotype:treatment be reasonable here?

Given this design, would passing these arguments to results() give me:

contrast=c("genotype","B","A") -- give me the differences due to genotype taking into account any differences due to treatment/time?

contrast=c("treatment","treated","control") -- give me the differences due to treatment overall or only at genotype A?

contrast=list(c("treatment_treated_vs_control","genotypeB.treatmenttreated")) -- give me just the differences in the treatment effect between genotype B vs genotype A? or just the total treatment effect for genotype B?

name="genotypeB.treatmenttreated" -- give me the difference in the treatment effect between genotype B vs genotype A?

Sorry, I know a lot of this is covered in the vignette/?results but I just wanted to make sure with 3 factors. I was also wondering how the time aspect would play into these comparisons, and if I should be adding another interaction term to my design (treatment:time) as while the control treatment shouldn't change over time, the treated samples should have a difference between the 2 time points.

Thanks!
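
For reference, here is a minimal sketch of how the coefficients of the proposed design line up with those results() calls. It uses only base R's model.matrix() on a hypothetical sample table; the level names ("30min", "1hr") and the helper group_row() are assumptions made for illustration and are not from the post. DESeq2 reports the same terms under slightly different names (for example genotypeB.treatmenttreated instead of genotypeB:treatmenttreated).

```r
# Hypothetical sample table: 2 genotypes x 2 treatments x 2 times x 3 replicates.
coldata <- expand.grid(
  genotype  = c("A", "B"),
  treatment = c("control", "treated"),
  time      = c("30min", "1hr"),
  replicate = 1:3
)
# Set the reference levels explicitly: A, control, 30min.
coldata$genotype  <- factor(coldata$genotype,  levels = c("A", "B"))
coldata$treatment <- factor(coldata$treatment, levels = c("control", "treated"))
coldata$time      <- factor(coldata$time,      levels = c("30min", "1hr"))

form   <- ~ genotype + treatment + time + genotype:treatment
design <- model.matrix(form, coldata)
print(colnames(design))

# Design-matrix row shared by all samples in one group
# (time is held at its reference level by default).
group_row <- function(g, trt, tm = "30min") {
  keep <- coldata$genotype == g & coldata$treatment == trt & coldata$time == tm
  colMeans(design[keep, , drop = FALSE])
}

# Treated vs control within each genotype, expressed in model coefficients.
effect_A <- group_row("A", "treated") - group_row("A", "control")
effect_B <- group_row("B", "treated") - group_row("B", "control")

print(effect_A)             # only the treatment main effect is non-zero
print(effect_B)             # treatment main effect plus the interaction term
print(effect_B - effect_A)  # only the interaction term is non-zero
```

Read this way, the treatment main effect is the treatment effect in the reference genotype (A), the main effect plus the interaction term is the treatment effect in genotype B, and the interaction term alone is the difference between the two. By the same logic, the genotype main effect in this design compares B vs A in control samples, with time entering only as an additive adjustment.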

Michael Love (United States) wrote (5 months ago):

Re: "whether the treatment effect is different over genotype" I like to start things off by asking how you want time to be included. One possibility is to consider a 30 min treatment effect and a 1 hour treatment effect and look at how these two effects are different over genotype. I'm assuming you have a full design, so 2 x 2 x 2 = 8 combinations of the 3 factors with 3 replicates each = 24 samples.
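
One quick way to confirm the "full design" assumption is to cross-tabulate the sample table. The sketch below uses a hypothetical coldata with exactly 3 replicates per cell; with real data the table would come from colData(dds), and the column and level names here are assumptions.

```r
# Hypothetical sample table with 3 replicates in each of the 8 cells.
coldata <- expand.grid(
  genotype  = c("A", "B"),
  treatment = c("control", "treated"),
  time      = c("30min", "1hr"),
  replicate = 1:3
)

# One row per genotype x treatment x time combination, with its sample count.
cells <- as.data.frame(table(coldata[, c("genotype", "treatment", "time")]))
print(cells)

# A full design has all 8 cells present, each with at least 3 replicates.
n_cells      <- nrow(cells)
min_per_cell <- min(cells$Freq)
print(c(cells = n_cells, min_replicates = min_per_cell, samples = nrow(coldata)))
```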

Hi Michael,

Thanks for the response. I think that would be one way I would want to look at it. Would it also be possible to look at whether the differences over time, say 1hr vs 30 min are different over genotypes? Perhaps this would over-complicate it a bit.

And yes, some of the combinations have more than 3 replicates, but each of the 8 combinations does have at least 3 replicates.

Thanks for the help!


hi,

If you code treatment so that it has three levels, "control","time30m", and "time60m", you can use a design, ~genotype + time + genotype:treatment.

This will provide four interaction terms for genotype x the two treatment effects, and you can contrast these using the 'list' style of contrast in the results() function.
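
As a rough illustration of this recoding, the sketch below builds the three-level treatment factor on a hypothetical sample table and inspects the resulting design matrix with base R's model.matrix(). The helper column "treated" and the time level names are assumptions; DESeq2 would report the interaction columns with "." in place of ":" (for example genotypeA.treatmenttime30m).

```r
# Hypothetical sample table: 2 genotypes x 2 times x treated/untreated x 3 replicates.
coldata <- expand.grid(
  genotype  = c("A", "B"),
  time      = c("30min", "1hr"),
  treated   = c(FALSE, TRUE),   # helper column, used only to build `treatment`
  replicate = 1:3
)
coldata$genotype <- factor(coldata$genotype, levels = c("A", "B"))
coldata$time     <- factor(coldata$time,     levels = c("30min", "1hr"))

# Recode treatment into three levels: control, treated for 30 min, treated for 1 hr.
coldata$treatment <- factor(
  ifelse(!coldata$treated, "control",
         ifelse(coldata$time == "30min", "time30m", "time60m")),
  levels = c("control", "time30m", "time60m")
)

design <- model.matrix(~ genotype + time + genotype:treatment, coldata)
print(colnames(design))

# Count the genotype x treatment interaction columns and check the matrix is full rank.
n_interaction <- sum(grepl(":", colnames(design), fixed = TRUE))
full_rank     <- qr(design)$rank == ncol(design)
print(c(interaction_terms = n_interaction, full_rank = full_rank))
```

Because treatment has no main effect in this formula, R codes the interaction separately within each genotype, which is where the four terms come from: one treatment effect per genotype per time point.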

"whether the differences over time, say 1hr vs 30 min are different over genotypes"

You'd need to have a more specific question than this, because this doesn't mention treatment at all.

Hi,

Thanks for that suggestion.

I've done that and would just like to confirm a few things. In this design:

results(dds,name="genotypeA.treatmenttreated1hr") would essentially be the treatment effect at 1 hr in genotype A or (treatment at 1 hr vs control treatment in samples that are genotype A)?

results(dds,contrast=list("genotypeA.treatmenttreated1hr","genotypeB.treatmenttreated1hr")) would be comparing the treatment effect (1hr) between genotype A and genotype B or essentially (treated 1hr vs control treatment in genotype A) vs (treated 1 hr vs control treatment in genotype B). Is that correct?

Also, is it reasonable to use this design for the purposes of getting this information i.e. whether the treatment effect is different over genotype, and use a design where all the factors are merged in a ~group design when wanting to do comparisons between specific groups (e.g. genotypeAtreated1hr vs genotypeAtreated30 or genotypeAcontrol vs genotypeBcontrol)? Or is there a way to do both in one model, for purposes of consistency?

Thanks again, your help is much appreciated.

Yes, that is the treatment effect at 1 hour in genotype A. Note, it is comparing to control samples that are genotype A at 1 hour.

Yes, that is the A vs B contrast for the treatment effect at 1 hour.

"Also, is it reasonable to use this design for the purposes of getting this information i.e. whether the treatment effect is different over genotype..."

I'm confused, because I feel like the above contrast already answers this question for each time point, and while taking into account the appropriate control samples, which is key.

If you recognize all the coefficients in the model, you can build up any comparison between two groups of samples by adding together coefficients with results() and 'contrast'. So there isn't the need to run a new model with ~group. You might want to sit down with a statistician who can explain what the terms in the linear model are doing. Or you can use ~group and do pairwise comparisons this way.
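
One way to sketch "adding together coefficients" is to subtract the design-matrix rows of the two groups being compared; the result is a numeric vector that results() accepts through its 'contrast' argument. The sample table, level names, and the helper group_row() below are hypothetical, and the vector's order should be checked against resultsNames(dds) before use.

```r
# Hypothetical sample table using the three-level treatment coding.
coldata <- expand.grid(
  genotype  = c("A", "B"),
  time      = c("30min", "1hr"),
  treated   = c(FALSE, TRUE),
  replicate = 1:3
)
coldata$genotype <- factor(coldata$genotype, levels = c("A", "B"))
coldata$time     <- factor(coldata$time,     levels = c("30min", "1hr"))
coldata$treatment <- factor(
  ifelse(!coldata$treated, "control",
         ifelse(coldata$time == "30min", "time30m", "time60m")),
  levels = c("control", "time30m", "time60m")
)
design <- model.matrix(~ genotype + time + genotype:treatment, coldata)

# Design-matrix row shared by all samples in one group.
group_row <- function(g, trt, tm) {
  keep <- coldata$genotype == g & coldata$treatment == trt & coldata$time == tm
  colMeans(design[keep, , drop = FALSE])
}

# Genotype A treated at 1 hr vs genotype A treated at 30 min.
contrast_time_in_A <- group_row("A", "time60m", "1hr") - group_row("A", "time30m", "30min")

# Genotype B control vs genotype A control, at the same time point.
contrast_geno_ctrl <- group_row("B", "control", "30min") - group_row("A", "control", "30min")

print(contrast_time_in_A)
print(contrast_geno_ctrl)

# With a DESeqDataSet `dds` fitted with the same design (not run here):
# results(dds, contrast = contrast_time_in_A)
```

The non-zero entries show which coefficients each comparison combines, so the same fitted model covers both the interaction questions and the pairwise group comparisons.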