Hello,
I apologize if this question has been answered many times before. I'm a little confused by DESeq2 coefficients. I have two factors, genotype (WT, KO) and batch (B1, B2, B3), and would like to compare gene expression between WT and KO while accounting for the batch effect. I came across a similar post, and I'm just not sure whether the coefficient pulled by results(dds, contrast=c("genotype","KO","WT"))
would be the genotype effect within batch B1 (the reference level) or the overall difference between genotypes, i.e. samples 1-6 (WT) vs. samples 7-12 (KO).
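library(DESeq2)
library(tidyverse)
# Simulated dataset: 1000 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 2)
# Samples 1-6 are WT, samples 7-12 are KO
dds$genotype <- factor(rep(c("WT", "KO"), each = 6))
# Make WT the reference level (the default would be alphabetical, i.e. KO)
dds$genotype <- relevel(dds$genotype, "WT")
# Batches cycle B1, B2, B3 across samples, so each genotype has 2 samples per batch
dds$batch <- factor(rep(c("B1", "B2", "B3"), 4))
dds$batch <- relevel(dds$batch, "B1")
colnames(dds) <- paste0("sample", seq_len(ncol(dds)))
# Additive design: batch + genotype, no interaction term
design(dds) <- ~1 + batch + genotype
dds <- DESeq(dds)
resultsNames(dds)
mod_mat <- model.matrix(design(dds), colData(dds))
results(dds, contrast=c("genotype","KO","WT"))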
> mod_mat
         (Intercept) batchB2 batchB3 genotypeKO
sample1            1       0       0          0
sample2            1       1       0          0
sample3            1       0       1          0
sample4            1       0       0          0
sample5            1       1       0          0
sample6            1       0       1          0
sample7            1       0       0          1
sample8            1       1       0          1
sample9            1       0       1          1
sample10           1       0       0          1
sample11           1       1       0          1
sample12           1       0       1          1
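One way to read this model matrix, sketched here in base R only (no DESeq2 needed): average the model-matrix rows for each genotype group and subtract. The names col_data, wt_mean, ko_mean, contrast_vec and b2_vec are made up for this illustration. In an additive design with no batch:genotype interaction, the batch columns should cancel and only genotypeKO should remain, both across all samples and within a single batch.

```r
# Rebuild the sample table from the post with base R (illustrative names).
col_data <- data.frame(
  genotype = factor(rep(c("WT", "KO"), each = 6), levels = c("WT", "KO")),
  batch    = factor(rep(c("B1", "B2", "B3"), 4), levels = c("B1", "B2", "B3")),
  row.names = paste0("sample", 1:12)
)
mod_mat <- model.matrix(~ 1 + batch + genotype, data = col_data)

# Average model-matrix row for each genotype group.
wt_mean <- colMeans(mod_mat[col_data$genotype == "WT", ])
ko_mean <- colMeans(mod_mat[col_data$genotype == "KO", ])

# KO minus WT over all samples: the batch columns cancel in this balanced design.
contrast_vec <- ko_mean - wt_mean
print(contrast_vec)

# The same comparison restricted to one batch (B2 here) gives the same vector,
# because the additive model has no batch-specific genotype term.
in_b2 <- col_data$batch == "B2"
b2_vec <- colMeans(mod_mat[in_b2 & col_data$genotype == "KO", , drop = FALSE]) -
  colMeans(mod_mat[in_b2 & col_data$genotype == "WT", , drop = FALSE])
print(b2_vec)
```

Since both vectors pick out only the genotypeKO column, the additive design gives a single KO vs. WT log2 fold change, adjusted for batch and assumed to be the same in B1, B2 and B3. As far as I know, results() also accepts a numeric contrast with one entry per coefficient in resultsNames(dds), so results(dds, contrast = contrast_vec) should correspond to contrast=c("genotype","KO","WT") for this design.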
Thanks a lot, Dr. Love.
I'm sorry. I should've paid more attention to the vignettes.
Hi Dr. Love,
I'm sorry to ask another question on my old post.
If the design isn't balanced, will the interpretation of the coefficients be different? For example, does the contrast
results(dds, contrast=c("genotype","KO","WT"))
mean the difference between KO and WT at the reference batch level, B1,
when using the code below? (7 WT samples and 6 KO samples, with 5, 4 and 4 samples for B1, B2 and B3.)
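The poster's own code for the unbalanced case is not shown here, so the following is only a hypothetical layout that matches the stated counts (7 WT, 6 KO; 5, 4 and 4 samples in B1, B2 and B3). It is a rough sketch using an ordinary linear model with made-up, noise-free values rather than DESeq2's negative binomial GLM, and the effect sizes (batch_effect, true_ko_effect) are assumptions chosen for illustration. The point is that the genotypeKO coefficient of an additive model still recovers the common KO vs. WT effect, while the raw difference of group means is distorted by the imbalance.

```r
# Hypothetical unbalanced layout: 7 WT + 6 KO; 5 x B1, 4 x B2, 4 x B3.
col_data <- data.frame(
  genotype = factor(c(rep("WT", 7), rep("KO", 6)), levels = c("WT", "KO")),
  batch = factor(
    c("B1", "B1", "B1", "B1", "B2", "B2", "B3",   # WT samples
      "B1", "B2", "B2", "B3", "B3", "B3"),        # KO samples
    levels = c("B1", "B2", "B3")
  )
)

# Made-up, noise-free additive truth: baseline + batch shift + KO effect.
batch_effect   <- c(B1 = 0, B2 = 1, B3 = 3)
true_ko_effect <- 2
col_data$y <- 5 +
  unname(batch_effect[as.character(col_data$batch)]) +
  true_ko_effect * (col_data$genotype == "KO")

# Additive model, same form as the design in the post.
fit <- lm(y ~ 1 + batch + genotype, data = col_data)
ko_coef <- unname(coef(fit)["genotypeKO"])

# Naive comparison that ignores batch.
raw_diff <- mean(col_data$y[col_data$genotype == "KO"]) -
  mean(col_data$y[col_data$genotype == "WT"])

print(ko_coef)   # batch-adjusted KO vs. WT effect
print(raw_diff)  # unadjusted difference of group means
```

In this sketch the coefficient matches the assumed KO effect even though the groups are unbalanced, whereas the unadjusted mean difference does not. So, under an additive design, the coefficient is not specific to B1; it is the KO vs. WT difference at any fixed batch level. A B1-specific effect would only arise if a batch:genotype interaction term were added to the design.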