Differential expression analysis with interaction terms
Assa Yeroslaviz ★ 1.5k
Last seen 5 weeks ago

Hi, I'm working with a data set with three time points (1d, 5d, 10d) and four treatments plus a control (ctrl, treat1, treat2, treat3, treat4). For each of the 15 combinations I have triplicates (45 samples in total).

If I understood correctly, both edgeR and DESeq2 handle this kind of design with interaction terms that combine multiple factors (they use different commands, but the interactions are similar). In this case the full model would be (~Treat + day + Treat:day) and the reduced model (~Treat + day).
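As a rough illustration of the two formulas above, the sketch below builds both design matrices in base R. The `sampleInfo` table is made up here (the column names `Treat` and `day` and the level names are assumptions taken from the post), so only the shape of the matrices is meaningful.

```r
# Hypothetical sample table: 5 treatments x 3 days x 3 replicates = 45 samples
sampleInfo <- expand.grid(
  rep   = 1:3,
  Treat = c("ctrl", "treat1", "treat2", "treat3", "treat4"),
  day   = c("1d", "5d", "10d")
)
# Make the reference levels explicit: ctrl and 1d come first
sampleInfo$Treat <- factor(sampleInfo$Treat,
                           levels = c("ctrl", "treat1", "treat2", "treat3", "treat4"))
sampleInfo$day   <- factor(sampleInfo$day, levels = c("1d", "5d", "10d"))

full    <- model.matrix(~ Treat + day + Treat:day, data = sampleInfo)
reduced <- model.matrix(~ Treat + day, data = sampleInfo)

# Columns present in `full` but not in `reduced` are the interaction terms
interaction_cols <- setdiff(colnames(full), colnames(reduced))
print(dim(full))
print(interaction_cols)
```

Comparing the full and reduced models asks whether any of those interaction columns is needed, i.e. whether the treatment effect differs between days for any treatment.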

To take the example from the contrast matrix in the edgeR manual: what would be the difference between these two contrasts?

DrugvsPlacebo.0h = Drug.0h-Placebo.0h,
DrugvsPlacebo.1h = (Drug.1h-Drug.0h)-(Placebo.1h-Placebo.0h),
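To make the two definitions concrete, here is a small arithmetic sketch with invented log2 group means for a single gene (the numbers are purely illustrative). The first contrast is a plain drug-vs-placebo difference at baseline; the second is a difference of differences, i.e. how much more the drug group changed from 0h to 1h than the placebo group did.

```r
# Invented log2 group means for one gene (illustrative only)
mu <- c(Drug.0h = 5.0, Drug.1h = 7.0, Placebo.0h = 4.0, Placebo.1h = 4.5)

# Contrast 1: drug vs placebo at 0h
DrugvsPlacebo.0h <- mu[["Drug.0h"]] - mu[["Placebo.0h"]]

# Contrast 2: 0h -> 1h change in drug, minus the same change in placebo
DrugvsPlacebo.1h <- (mu[["Drug.1h"]] - mu[["Drug.0h"]]) -
                    (mu[["Placebo.1h"]] - mu[["Placebo.0h"]])

# For comparison: a simple pairwise drug vs placebo difference at 1h
pairwise.1h <- mu[["Drug.1h"]] - mu[["Placebo.1h"]]

print(c(DrugvsPlacebo.0h, DrugvsPlacebo.1h, pairwise.1h))
```

In this toy case the pairwise difference at 1h equals the sum of the two contrasts, which shows that the second contrast only captures the part of the 1h difference that was not already present at 0h.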

If I want to test for changes between treat and ctrl at each time point, should I use the first kind of contrast and do the following (after combining the treatment and day columns from the sample information table)?

treat1vsWT.1d = treat1.1d-WT.1d
treat2vsWT.1d = treat2.1d-WT.1d
treat3vsWT.10d = treat3.10d-WT.10d

which would give me 12 different pair-wise comparisons.
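A minimal sketch of how all 12 pairwise contrasts could be generated rather than typed by hand, assuming a group-means design such as `~ 0 + group` whose columns are named `WT.1d`, `treat1.1d`, and so on (those names are an assumption based on the contrasts above). It uses only base R; with limma installed, `makeContrasts` is the more usual way to build the same matrix.

```r
treats <- c("treat1", "treat2", "treat3", "treat4")
days   <- c("1d", "5d", "10d")

# Assumed column names of a ~ 0 + group design (15 groups)
groups <- as.vector(outer(c("WT", treats), days, paste, sep = "."))

# One row per treatment/day pair to compare against WT on the same day
pairs <- expand.grid(treat = treats, day = days, stringsAsFactors = FALSE)

# Each contrast is +1 for the treated group and -1 for WT on that day
contrasts <- sapply(seq_len(nrow(pairs)), function(i) {
  v <- setNames(numeric(length(groups)), groups)
  v[paste(pairs$treat[i], pairs$day[i], sep = ".")] <- 1
  v[paste("WT", pairs$day[i], sep = ".")] <- -1
  v
})
colnames(contrasts) <- paste0(pairs$treat, "vsWT.", pairs$day)

print(dim(contrasts))
print(contrasts[, "treat1vsWT.1d"])
```

The rows of this matrix must be in the same order as the columns of the design matrix before a column is passed to the `contrast` argument of `glmQLFTest`.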

But what is different in the second contrast in the example above?

Another question is what would happen if I used a nested interaction formula like this instead:

design <- model.matrix(~Treat + Treat:day, data=sampleInfo)
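As a sketch of what this formula produces, the code below rebuilds the design on a made-up `sampleInfo` table (level names and ordering are assumptions) and lists the coefficients. Under these assumptions there are 15 columns, and each `Treat...:day...` column is the change over time within one treatment, so a treat1-vs-ctrl comparison on a given day has to be assembled from several coefficients. Selecting columns by name may be safer than selecting by position.

```r
# Hypothetical sample table; ctrl and 1d are the reference levels
sampleInfo <- expand.grid(
  rep   = 1:3,
  Treat = c("ctrl", "treat1", "treat2", "treat3", "treat4"),
  day   = c("1d", "5d", "10d")
)
design <- model.matrix(~ Treat + Treat:day, data = sampleInfo)
print(colnames(design))

# Meaning of the coefficients in this parametrization:
#   Treattreat1       = treat1.1d - ctrl.1d
#   Treattreat1:day5d = treat1.5d - treat1.1d  (time change WITHIN treat1)
#   Treatctrl:day5d   = ctrl.5d   - ctrl.1d    (time change WITHIN ctrl)
# treat1 vs ctrl at 5d is therefore a combination of three coefficients
con_5d <- setNames(numeric(ncol(design)), colnames(design))
con_5d[c("Treattreat1", "Treattreat1:day5d")] <- 1
con_5d["Treatctrl:day5d"] <- -1

# Sanity check: the contrast equals the difference of the two design rows
row_t <- design[which(sampleInfo$Treat == "treat1" & sampleInfo$day == "5d")[1], ]
row_c <- design[which(sampleInfo$Treat == "ctrl"   & sampleInfo$day == "5d")[1], ]
stopifnot(all(row_t - row_c == con_5d))
```

A vector like `con_5d` could then be passed as `contrast = con_5d` to `glmQLFTest`.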

Now I will have many coefficients. If I'm looking for genes changing over all time points, I would combine the coefficients into one vector. Let's say I would like to find all genes that changed significantly between the control and treat1 on all days. Would this be the correct syntax?

qlf <- glmQLFTest(fit, coef=c(7,12,17))

Would this give me the genes changed over all time points? Does this mean these genes are significantly changed at every time point independently?
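One way to build intuition for a multi-coefficient test is the sketch below, which uses an ordinary linear model in base R as a stand-in for edgeR's quasi-likelihood F-test (so it is an analogy, not the edgeR computation). Passing several coefficients to `coef` tests the null hypothesis that all of them are zero, so a gene can be called significant when only one of them differs from zero. The data are simulated and every name here is hypothetical.

```r
set.seed(1)
sampleInfo <- expand.grid(
  rep   = 1:3,
  Treat = c("ctrl", "treat1", "treat2", "treat3", "treat4"),
  day   = c("1d", "5d", "10d")
)
design <- model.matrix(~ Treat + Treat:day, data = sampleInfo)

# Simulated log-expression for one gene: treat1 changes ONLY at 10d
y <- rnorm(nrow(sampleInfo), sd = 0.1) +
  3 * (sampleInfo$Treat == "treat1" & sampleInfo$day == "10d")

# Coefficients tested jointly (both time-change terms within treat1)
test_cols  <- c("Treattreat1:day5d", "Treattreat1:day10d")
design_red <- design[, !(colnames(design) %in% test_cols)]

fit_full <- lm(y ~ 0 + design)
fit_red  <- lm(y ~ 0 + design_red)

# F-test of "both coefficients are zero" against "at least one is not"
p_joint <- anova(fit_red, fit_full)[["Pr(>F)"]][2]
print(p_joint)
```

The joint test rejects here even though the gene does not change at 5d, which suggests that a significant result from a multi-coefficient test should be read as "changed at some time point" rather than "changed at every time point".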


edger deseq2 interaction design matrix • 706 views
Last seen 1 day ago
United States

With that many different combinations the number of interactions gets so large that trying to get a high level assessment of what is going on becomes (IMO) almost impossible. You would probably be better off using a spline fit and testing for the interaction that way. If there are lots of genes, you can then use something like k-means to cluster the genes that have a significant interaction between the spline and treatment and present results as sets of genes that react similarly, over time, to a given treatment, which is much easier to explain than trying to get a cohesive picture from 15 different interactions.
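A minimal sketch of the spline design suggested above, assuming the days are available as a numeric column (called `day_num` here, a hypothetical name) and using `ns` from the `splines` package that ships with R. With only three time points, 2 degrees of freedom is the most the spline can use. The final `glmQLFTest` call is left as a comment because it needs a fitted edgeR object.

```r
library(splines)

# Hypothetical sample table with numeric time
sampleInfo <- expand.grid(
  rep     = 1:3,
  Treat   = c("ctrl", "treat1", "treat2", "treat3", "treat4"),
  day_num = c(1, 5, 10)
)

# Natural spline basis for time
X <- ns(sampleInfo$day_num, df = 2)

# Treatment, spline, and treatment-by-spline interaction
design <- model.matrix(~ Treat * X, data = sampleInfo)

# Interaction columns: do the time curves differ from ctrl?
spline_int <- grep(":X", colnames(design), fixed = TRUE)
treat1_int <- grep("Treattreat1:X", colnames(design), fixed = TRUE)
print(colnames(design)[spline_int])

# With an edgeR fit on this design, the interaction could be tested as e.g.
# qlf <- glmQLFTest(fit, coef = treat1_int)
```

Genes with a significant interaction could then be grouped by the shape of their fitted curves, for example with `kmeans`, as the answer proposes.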

