Question

DESeq2 single-cell data paired design

Caro • 0

@9a17be10

Last seen 2.2 years ago

Singapore

Hi all

I have a single cell data experimental design which looks like this:

cluster_metadata

Sample_N  Condition  Treatment  Sample_ID  Sample_Name
1         healthy    untreated  S1         H1
2         healthy    treated    S1         H2
3         healthy    untreated  S2         H3
4         healthy    treated    S2         H4
5         healthy    untreated  S3         H1
6         healthy    treated    S3         H2
7         healthy    untreated  S4         H3
8         healthy    treated    S4         H4
9         disease    untreated  S5         DB1
10        disease    treated    S5         DB2
11        disease    untreated  S6         DB3
12        disease    treated    S6         DB1
13        disease    untreated  S7         DB2
14        disease    treated    S7         DB3

I have 4 healthy individuals (S1, S2, S3, S4) and 3 individuals with disease (S5, S6, S7). Each individual has a sample treated with a drug and a sample left untreated, and I would like to determine the effect of the drug treatment. It is a paired experimental design, since the same individual contributes both the treated and the untreated sample. I have run some analysis, obtained a count matrix (cluster_counts), and reached the point where I have to specify a design formula:

dds <- DESeqDataSetFromMatrix(cluster_counts, 
                              colData = cluster_metadata, 
                              design = ~ Sample_Name + Treatment)

However, this formula does not seem correct as I see wrong comparisons in the output.

After running :

dds <- DESeq(dds)
resultsNames(dds)

it gives:

> resultsNames(dds)
[1] "Intercept"                       "Sample_Name_DB2_vs_DB1"          "Sample_Name_DB3_vs_DB1"
[4] "Sample_Name_H1_vs_DB1"           "Sample_Name_H2_vs_DB1"           "Sample_Name_H3_vs_DB1"
[7] "Sample_Name_H4_vs_DB1"           "condition_treated_vs_untreated"

What is the correct formula for this experimental design? Thanks

R • 804 views

updated 2.2 years ago by Marek Gierlinski ▴ 30 • written 2.2 years ago by Caro • 0

score 0 · Answer 1 · 2023-04-26

The formula passed to the design parameter should include the relevant independent variables, in your case Condition and Treatment:

~ Condition + Treatment

or, if you suspect an interaction between the two variables,

~ Condition + Treatment + Condition:Treatment

Using Sample_Name in the formula will not work. You are not asking how the result depends on the sample name; you are asking how it depends on condition and treatment.
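
To see what the two formulas above actually fit, here is a minimal sketch in base R that rebuilds the metadata from the question and prints the design-matrix columns for each formula. The reference levels (healthy, untreated) and the object names additive and with_interaction are assumptions made for illustration; they are not from the original post.

```r
# Metadata rebuilt from the table in the question.
# Reference levels (healthy, untreated) are an assumption for this sketch.
cluster_metadata <- data.frame(
  Condition = factor(rep(c("healthy", "disease"), times = c(8, 6)),
                     levels = c("healthy", "disease")),
  Treatment = factor(rep(c("untreated", "treated"), times = 7),
                     levels = c("untreated", "treated")),
  Sample_ID = factor(rep(paste0("S", 1:7), each = 2))
)

# Additive model: one coefficient per variable.
additive <- model.matrix(~ Condition + Treatment, data = cluster_metadata)

# Model with an interaction: adds a term for a treatment effect
# that differs between healthy and disease.
with_interaction <- model.matrix(~ Condition + Treatment + Condition:Treatment,
                                 data = cluster_metadata)

print(colnames(additive))
# "(Intercept)" "Conditiondisease" "Treatmenttreated"

print(colnames(with_interaction))
# "(Intercept)" "Conditiondisease" "Treatmenttreated"
# "Conditiondisease:Treatmenttreated"
```

The same formula would be passed as the design argument of DESeqDataSetFromMatrix. Note that neither formula includes Sample_ID, so the pairing within individuals is not modeled explicitly in this sketch.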