I am trying to analyze the RNAseq data using Salmon + DEseq2. I have two variables: genotype (WT or KO) + condition ( treatment A or B), as shown below:
>samples
# run genotype condition
#sample1 sample0001_quant WT A
#sample2 sample0002_quant WT B
#sample3 sample0003_quant KO A
#sample4 sample0004_quant KO B
#sample5 sample0005_quant WT A
#sample6 sample0006_quant WT B
#sample7 sample0007_quant KO A
#sample8 sample0008_quant KO B
#sample9 sample0009_quant WT A
#sample10 sample0010_quant WT B
#sample11 sample0011_quant KO A
#sample12 sample0012_quant KO B
I am interested in comparing mRNA expression with:
conditionA_WT_vs_KO, WT as the reference level.
conditionB_WT_vs_KO, WT as the reference level.
genotypeKO_A_vs_B, conditionA as the reference level.
After I read other threads, I thought that there are two ways to do that:
- combine the factors of interest into a single factor:
design(dds) <- ~ group
- just add the interactions:
design(dds) < ~genotype + condition + genotype:condition
I could not figure it out how to do the pairwise comparison using the second way, so I decided to use the group factor:
#Read the counts from salmon
library("tximport")
library("jsonlite")
library("readr")
txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion = TRUE)
#Analyze the data
library("DESeq2")
ddsTxi <- DESeqDataSetFromTximport(txi,
colData = samples,
design = ~ genotype + condition)
ddsTxi$group <- factor(paste0(ddsTxi$genotype, ddsTxi$condition))
design(ddsTxi) <- ~group
dds <- DEseq(ddsTxi)
conditionA_WT_vs_KO <- results(dds, contrast=c("group", "WTA", "KOA"))
conditionB_WT_vs_KO <- results(dds, contrast=c("group", "WTB", "KOB"))
genotypeKO_A_vs_B <- results(dds, contrast=c("group", "KOA", "KOB"))
So,
- Did I compare the samples correctly as I wanted?
- Will I get the same results using
~genotype + condition + genotype:condition
design? - Do I need to specify the reference level, or re-set the reference level at some point (for examle, for genotypeKO_A_vs_B)?
Thank you!