Question

DESeq2 coefficient


JKim • 0


Hello,

I apologize if this question has been answered many times before. I'm a little confused about DESeq2 coefficients. I have two factors, genotype (WT, KO) and batch (B1, B2, B3), and would like to compare gene expression between WT and KO while accounting for the batch effect. I came across a similar post, and I'm not sure whether the result pulled with results(dds, contrast=c("genotype","KO","WT")) is the genotype effect within batch B1 (the reference level) or the overall difference between genotypes, i.e. samples 1 to 6 (WT) vs. samples 7 to 12 (KO).


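library(DESeq2)

# Simulated data set: 1000 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 2)

# Samples 1-6 are WT, samples 7-12 are KO; WT is the reference level
dds$genotype <- factor(rep(c("WT", "KO"), each = 6))
dds$genotype <- relevel(dds$genotype, "WT")

# Batches cycle B1, B2, B3, so each genotype has two samples per batch
dds$batch <- factor(rep(c("B1", "B2", "B3"), 4))
dds$batch <- relevel(dds$batch, "B1")

colnames(dds) <- paste0("sample", seq_len(ncol(dds)))

# Additive design: batch and genotype, no interaction term
design(dds) <- ~1 + batch + genotype
dds <- DESeq(dds)
resultsNames(dds)
mod_mat <- model.matrix(design(dds), colData(dds))

# KO vs WT comparison
results(dds, contrast=c("genotype","KO","WT"))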

> mod_mat
         (Intercept) batchB2 batchB3 genotypeKO
sample1            1       0       0          0
sample2            1       1       0          0
sample3            1       0       1          0
sample4            1       0       0          0
sample5            1       1       0          0
sample6            1       0       1          0
sample7            1       0       0          1
sample8            1       1       0          1
sample9            1       0       1          1
sample10           1       0       0          1
sample11           1       1       0          1
sample12           1       0       1          1

DESeq2 coefficient • 891 views

ADD COMMENT • link 10 months ago • updated 8 months ago JKim • 0

score 1 · Accepted Answer · 2024-02-26


Michael Love 43k


With standard coding in R, and no interaction term, it's across all batches.

If you add an interaction term, though, the interpretation changes: the genotype coefficient then describes the effect only at the reference level of any controlling covariate. This can be visualized with ExploreModelMatrix, a package by some Bioconductor developers for helping users understand designs.
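The distinction can be sketched in base R without running DESeq2. This is only an analogy: it uses an ordinary linear model (lm) on made-up, noise-free log-scale values for a single gene, whereas DESeq2 fits a negative binomial GLM. The values in batch_shift and ko_effect are assumptions chosen for illustration, with a KO effect that differs by batch so the two designs give visibly different genotype coefficients.

```r
# Balanced layout from the question: 6 WT then 6 KO, batches cycling B1, B2, B3
coldata <- data.frame(
  genotype = factor(rep(c("WT", "KO"), each = 6), levels = c("WT", "KO")),
  batch    = factor(rep(c("B1", "B2", "B3"), 4), levels = c("B1", "B2", "B3"))
)

# Hypothetical values (not from the thread): batch offsets and a
# batch-specific KO effect on the log scale
batch_shift <- c(B1 = 0, B2 = 0.5, B3 = -0.5)
ko_effect   <- c(B1 = 1, B2 = 2, B3 = 3)

b  <- as.character(coldata$batch)
ko <- as.numeric(coldata$genotype == "KO")
coldata$y <- unname(5 + batch_shift[b] + ko * ko_effect[b])

# Additive design: one genotype coefficient shared by all batches
fit_add <- lm(y ~ batch + genotype, data = coldata)

# Interaction design: the genotype main effect refers to batch B1 only
fit_int <- lm(y ~ batch * genotype, data = coldata)

coef(fit_add)["genotypeKO"]  # 2, the average of the per-batch effects 1, 2, 3
coef(fit_int)["genotypeKO"]  # 1, the KO effect in reference batch B1
```

In this balanced toy layout the additive coefficient equals the plain average of the three within-batch KO vs WT differences, which is one way to read "across all batches".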

ADD COMMENT • link 10 months ago Michael Love 43k


Thanks a lot, Dr. Love.

ADD REPLY • link 10 months ago JKim • 0


"The key point to remember about designs with interaction terms is that, unlike for a design ~genotype + condition, where the condition effect represents the overall effect controlling for differences due to genotype, by adding genotype:condition, the main condition effect only represents the effect of condition for the reference level of genotype (I, or whichever level was defined by the user as the reference level). The interaction terms genotypeII.conditionB and genotypeIII.conditionB give the difference between the condition effect for a given genotype and the condition effect for the reference genotype."

(From the DESeq2 vignette, section on interactions.)
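Applied to the genotype and batch factors in this thread, the quoted rule can be sketched with base R. The example assumes an ordinary linear model on made-up, noise-free values (batch_shift and ko_effect are hypothetical), so it only illustrates how the coefficients combine and is not DESeq2 itself. The KO effect in a non-reference batch is the genotype main effect plus that batch's interaction term.

```r
# Balanced toy layout: 2 samples per genotype-by-batch cell
coldata <- data.frame(
  genotype = factor(rep(c("WT", "KO"), each = 6), levels = c("WT", "KO")),
  batch    = factor(rep(c("B1", "B2", "B3"), 4), levels = c("B1", "B2", "B3"))
)

# Hypothetical log-scale values with a KO effect that differs by batch
batch_shift <- c(B1 = 0, B2 = 0.5, B3 = -0.5)
ko_effect   <- c(B1 = 1, B2 = 2, B3 = 3)
b  <- as.character(coldata$batch)
ko <- as.numeric(coldata$genotype == "KO")
coldata$y <- unname(5 + batch_shift[b] + ko * ko_effect[b])

# Design with an interaction between batch and genotype
fit <- lm(y ~ batch * genotype, data = coldata)
cf  <- coef(fit)

# Main effect = KO vs WT in reference batch B1;
# adding an interaction term gives KO vs WT in that batch
effect_by_batch <- c(
  B1 = cf[["genotypeKO"]],
  B2 = cf[["genotypeKO"]] + cf[["batchB2:genotypeKO"]],
  B3 = cf[["genotypeKO"]] + cf[["batchB3:genotypeKO"]]
)
effect_by_batch  # recovers the simulated effects 1, 2, 3
```

In DESeq2 the same combination is expressed with a list-style contrast, which the vignette shows as results(dds, contrast=list(c("condition_B_vs_A","genotypeIII.conditionB"))). The corresponding coefficient names for a batch and genotype design should be read from resultsNames(dds) rather than assumed.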

I'm sorry. I should've paid more attention to the vignettes.

ADD REPLY • link 9 months ago JKim • 0


Hi Dr. Love,

I'm sorry to add another question to my old post.

If the design isn't balanced, will the interpretation of the coefficients be different? For example, with the code below, does the contrast results(dds, contrast=c("genotype","KO","WT")) mean the difference between KO and WT at the reference batch level B1?

(There are 7 WT and 6 KO samples, and 5, 4 and 4 samples in B1, B2 and B3, respectively.)

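library(DESeq2)

# Simulated data set: 1000 genes, 13 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 13, betaSD = 2)

# Genotype and batch are assigned at random; no seed is set, so the
# assignment (and the model matrix shown below) will differ between runs
dds$genotype <- factor(sample(c("WT", "KO"), size = 13, replace = TRUE))
dds$genotype <- relevel(dds$genotype, "WT")
dds$batch <- factor(sample(c("B1", "B2", "B3"), size = 13, replace = TRUE))
dds$batch <- relevel(dds$batch, "B1")

colnames(dds) <- paste0("sample", seq_len(ncol(dds)))

# Additive design: batch and genotype, no interaction term
design(dds) <- ~1 + batch + genotype
dds <- DESeq(dds)
resultsNames(dds)
mod_mat <- model.matrix(design(dds), colData(dds))

# KO vs WT comparison
results(dds, contrast=c("genotype","KO","WT"))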

> mod_mat
         (Intercept) batchB2 batchB3 genotypeKO
sample1            1       0       0          0
sample2            1       0       1          0
sample3            1       0       1          1
sample4            1       1       0          1
sample5            1       0       0          0
sample6            1       0       0          1
sample7            1       1       0          0
sample8            1       0       1          0
sample9            1       0       0          0
sample10           1       0       1          1
sample11           1       1       0          0
sample12           1       1       0          1
sample13           1       0       0          1

> dds$batch |> table()

B1 B2 B3 
 5  4  4 


> dds$genotype |> table()

WT KO 
 7  6

ADD REPLY • link 8 months ago JKim • 0
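One way to probe the unbalanced case is a small base-R sketch that copies the genotype and batch assignment from the model matrix above. It is an analogy under stated assumptions: an ordinary linear model rather than DESeq2's negative binomial GLM, and made-up noise-free values in which the KO effect is 2 in every batch (batch_shift and the effect size are hypothetical). The additive design still has a single genotype coefficient shared by all batches, so the fit returns the common effect rather than something specific to B1.

```r
# Sample layout copied from the unbalanced model matrix in the post
coldata <- data.frame(
  batch = factor(
    c("B1", "B3", "B3", "B2", "B1", "B1", "B2",
      "B3", "B1", "B3", "B2", "B2", "B1"),
    levels = c("B1", "B2", "B3")
  ),
  genotype = factor(
    c("WT", "WT", "KO", "KO", "WT", "KO", "WT",
      "WT", "WT", "KO", "WT", "KO", "KO"),
    levels = c("WT", "KO")
  )
)

# Hypothetical log-scale values: batch offsets plus a KO effect of 2
# that is the same in every batch
batch_shift <- c(B1 = 0, B2 = 1, B3 = 3)
b  <- as.character(coldata$batch)
ko <- as.numeric(coldata$genotype == "KO")
coldata$y <- unname(5 + batch_shift[b] + 2 * ko)

# Additive fit: the genotype coefficient is the batch-adjusted KO vs WT effect
fit_add <- lm(y ~ batch + genotype, data = coldata)
adjusted <- coef(fit_add)[["genotypeKO"]]

# Naive difference of group means ignores batch and is pulled away from 2
# because the genotypes are spread unevenly over the batches
naive <- mean(coldata$y[ko == 1]) - mean(coldata$y[ko == 0])

adjusted  # 2
naive     # about 2.19
```

If the true KO effect differed between batches, the additive coefficient would be a compromise across batches and an interaction design would be needed to separate them.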