Hello. I'm familiar with DESeq2, but the most complex designs I've used so far involved blocked pairs. I now need a more complicated design and I'm having trouble formulating the design formula. I've read in a design table that contains all the relevant information. Here are the specifications:

condition = sample/control (what I'm testing over)

week = a timepoint comparison of when the sample was generated

grade = the grade of disease the sample has (so 4 > 3 > 2 > 1 > 0, i.e. 0 is control)

There are also 3 possible organ sites but some samples are a combination, so I've split it into 3 binary variables: isSiteA, isSiteB, and isSiteC, all of which are 0 or 1

This is what the table looks like:

Sample | Condition | Week | Grade | isSiteA | isSiteB | isSiteC |
---|---|---|---|---|---|---|
SAMPA | sample | 12 | 3 | 1 | 0 | 0 |
SAMPB | control | 2 | 0 | 0 | 1 | 1 |
SAMPC | control | 4 | 0 | 0 | 1 | 0 |

Right now my equation is this:

ddsHTSeq <- DESeqDataSetFromMatrix(counts_table, sample_table, design = ~ week : grade : is_GI + is_Liver + is_Skin + condition)

Is this correct at all? I really appreciate any help. FYI I do have more than 3 samples, before anyone asks!

hi Alex,

This looks like more of a statistical analysis question than a specific DESeq2 software question (any design or terms you would test in DESeq2 would be the same for a normal linear regression). I'm pressed for time and working on software right now. I'll leave this as a comment and not an answer, so someone else may make suggestions, but if you don't get a response, I'd recommend talking to a local statistician at your institute who could help come up with the design and the terms to test.

I agree. This experimental design is complex enough that you're not going to be able to come up with the "one true design" from a single consultation with a statistician. Rather, you're going to need some back-and-forth dialogue to investigate which factors and interactions are important, and what the best way to model them is. This is why you should definitely find a local statistician who you can keep coming back to for help.
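
Since a DESeq2 design is an ordinary linear-model formula, one cheap check to do before that dialogue is to build the model matrix in base R and see whether it is full rank. The sketch below is only an illustration. The six-sample table, its values, and the `is_full_rank` helper are made up, not taken from the original post. It assumes grade is treated as a factor and that grade 0 occurs only in controls, as described in the question. Under that assumption the grade indicator columns sum to the condition column, so a formula containing both terms is rank deficient, which DESeq2 rejects with an error.

```r
# Hypothetical sample table; column names follow the design formula in the
# question, values are invented for illustration only.
sample_table <- data.frame(
  condition = factor(c("control", "control", "sample", "sample", "sample", "sample"),
                     levels = c("control", "sample")),
  week  = c(2, 4, 12, 2, 4, 12),           # numeric time point
  grade = factor(c(0, 0, 1, 2, 3, 1))      # grade 0 occurs only in controls
)

# Hypothetical helper: TRUE when no column of the model matrix is a linear
# combination of the other columns.
is_full_rank <- function(mm) qr(mm)$rank == ncol(mm)

# Design with both grade and condition: the grade1 + grade2 + grade3 columns
# add up to the conditionsample column, so the matrix is rank deficient.
mm_both <- model.matrix(~ week + grade + condition, data = sample_table)
print(colnames(mm_both))
print(is_full_rank(mm_both))

# Dropping condition removes the dependency in this toy table.
mm_grade <- model.matrix(~ week + grade, data = sample_table)
print(is_full_rank(mm_grade))
```

Whether to drop `condition`, collapse grade into fewer levels, or model grade as numeric is exactly the kind of modelling decision to take to a statistician. The check above only tells you that a given formula can be fitted, not that it answers the biological question.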
