Hi, I'm analyzing an RNA-seq experiment with DESeq2, but I have a problem. I have two groups (treated and untreated) made up of the same subjects. I would like to test whether the treatment altered RNA expression, using a paired analysis.

My data file is structured with 5 columns:

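| Sample | Treatment | Subject | Age | Sex |
|--------|-----------|---------|-----|-----|
| 1 | Y | A | 31 | M |
| 2 | N | A | 30 | M |
| 3 | Y | B | 27 | F |
| 4 | N | B | 25 | F |
| 5 | Y | C | 47 | M |
| 5 | N | C | 46 | M |
| ... | | | | |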

I would like to include all the variables in the DESeq2 analysis. However, when I use the design `~ Age + Sex + Subject + Treatment`, I get this error:

Error in DESeqDataSet(se, design = design, ignoreRank) : the model matrix is not full rank, so the model cannot be fit as specified. one or more variables or interaction terms in the design formula are linear combinations of the others and must be removed

How can I solve the issue?

Thank you

Sorry, I forgot to mention that I have a total of 46 samples (23 subjects). I don't understand why the message says that "one or more variables or interaction terms in the design formula are linear combinations of the others and must be removed", or how I can avoid it.

It means that two or more of your variables are nested within each other, so one of them adds no independent information to the model. You are performing a paired analysis, so what you are effectively measuring is the difference between the two samples from the same subject, treated versus untreated. Think about it: if you account for the Subject effect (= paired analysis), then Sex and Age are already accounted for, because both are attributes of the individual being measured. So the design

`~Subject+Treatment`

is probably what you want. I do not know your entire dataset, but from what I can see it behaves as I explained. Subject and Sex are tied together: every A sample is M, every B is F, every C is M, so the Sex column carries no information beyond what Subject already provides. You need to remove either Sex or Subject from your design.
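If you want to see this for yourself, here is a minimal sketch in base R (no DESeq2 needed) that builds the model matrix for both designs and compares the number of columns with the matrix rank. The sample table is a toy copy of the six rows you posted, and `check_rank` is just a helper name I made up for illustration; with your real 46 samples the numbers will differ, but the idea is the same.

```r
# Toy sample table mirroring the six rows shown in the question (assumption:
# your real colData has the same columns, with 46 rows instead of 6).
coldata <- data.frame(
  Treatment = factor(c("Y", "N", "Y", "N", "Y", "N"), levels = c("N", "Y")),
  Subject   = factor(c("A", "A", "B", "B", "C", "C")),
  Age       = c(31, 30, 27, 25, 47, 46),
  Sex       = factor(c("M", "M", "F", "F", "M", "M"))
)

# Hypothetical helper: build the model matrix for a design and report
# how many columns it has versus its numerical rank.
check_rank <- function(design, data) {
  mm <- model.matrix(design, data)
  c(columns = ncol(mm), rank = qr(mm)$rank)
}

# Design from the question: Sex is constant within each subject, so its
# column is a linear combination of the intercept and Subject columns.
full <- check_rank(~ Age + Sex + Subject + Treatment, coldata)

# Paired design: one column per subject offset plus the treatment effect.
paired <- check_rank(~ Subject + Treatment, coldata)

print(full)    # rank is lower than the number of columns: not full rank
print(paired)  # rank equals the number of columns: full rank
```

Whenever the rank is lower than the number of columns, DESeq2 will refuse the design with the error you saw. Once the check passes, the same formula goes into the `design` argument, for example `DESeqDataSetFromMatrix(countData, colData, design = ~ Subject + Treatment)`.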

Thank you for the reply. I understand and I agree. So, if I would like to investigate the effect of sex on gene expression, should I add a "nested factor", for example `Treatment:Sex`? Thank you again.