I have a following issue when running DESeq2 with a continuous effect variable.

In my experiment there are 2 time points and 2 groups. Each sample (individual) has a sample at timepoint 1 and timepoint 2, but belongs to only one of the groups.

dsgn <- "~1 + Batch + Group + Timepoint + Timepoint:Group + Timepoint:Group:X" dds <- DESeq2::DESeqDataSetFromMatrix( countData = countData, colData = colData, design = formula(dsgn) ) dds <- DESeq2::estimateSizeFactors(dds, "poscounts") dds <- DESeq2::DESeq(dds) resGAT0 <- results(dds, name = "GroupA.Timepoint0.X")

I am interested in the effect of a continuous variable X within each group A or B at each time point 0 or 1. I thought, the code above would achieve this. However, when inspecting results I found that many features (species) declared significant by DESeq2 (with a large logfold change) in the 'resGAT0' table had in fact all zero counts in the original data within that particular group A at time point 0. Is there some issue with the way I set up my DESeq2 analysis? In that case how to I obtain the actual effect for X within each of the 4 groups/time points combinations?

Is the issue related to the fact that the DESeq2::DESeq(dds) command returned a warning:

195 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest

Thanks you!

Another comment on this:

"the effect of a continuous variable X within each group A or B at each time point 0 or 1"The term you have pulled out is: whether the difference in the slope in log counts associated with X from time=0 compared to timepoint=1 is

itselfdifferent in group A vs group B. It's not exactly the same as the description of the term you are after. Another note is that it's typical to have e.g. time=0 and group=A as the reference level, but it seems you have it reversed.I might suggest that you combine time and group into a single factor, call that "group" and then use a design of

`~group:X,`

which will create a slope for each combination of time (0 or 1) and group (A or B as you have it).Thank you for a quick response!

I found that including the term Timepoint first would give me the correct ordering:

My full column names for the design matrix would be (DESeq2::resultsNames(dds)):

1) In the current notation then what should be the terms I need to pull out to get the effect (slope) of the X variable in each group?

2) I like your idea of creating a new 4 level category factor variable, Cat, e.g. Tmpt0_GroupA. In this way my design should be probably:

With this design I get:

This would produce for intercept coefficients and 4 X-slope coefficient the same way as the previous design. I assume in this case I can simply call e.g.

to obtain the logfolds?

How would doing the same look like in the previous notation?

Thank you!

Lan

hi,

"What should be the terms I need to pull out to get the effect (slope) of the X variable in each group?"- rather than answer this, which takes a long time, I would prefer you to use the design I recommended, which gives you four coefficients, one for the slope in each group, which is what you want. You can then pull them out with`results(dds, name="...")`

.Your code above isn't what I was suggesting, instead use:

If batch is not confounded with group you can add batch too:

`~batch + z:x`

Wouldn't this mean that all groups share the same intercept? I am interested in the effect (slope) of the variable x in each group, and each group should be allowed to have dissimilar mean (intercept).

You can add 'group' if it is not confounded with batch, to have different intercepts.