I am trying to analyze the RNAseq data using Salmon + DEseq2. I have two variables: genotype (WT or KO) + condition ( treatment A or B), as shown below:
>samples # run genotype condition #sample1 sample0001_quant WT A #sample2 sample0002_quant WT B #sample3 sample0003_quant KO A #sample4 sample0004_quant KO B #sample5 sample0005_quant WT A #sample6 sample0006_quant WT B #sample7 sample0007_quant KO A #sample8 sample0008_quant KO B #sample9 sample0009_quant WT A #sample10 sample0010_quant WT B #sample11 sample0011_quant KO A #sample12 sample0012_quant KO B
I am interested in comparing mRNA expression with:
conditionA_WT_vs_KO, WT as the reference level.
conditionB_WT_vs_KO, WT as the reference level.
genotypeKO_A_vs_B, conditionA as the reference level.
After I read other threads, I thought that there are two ways to do that:
- combine the factors of interest into a single factor:
design(dds) <- ~ group
- just add the interactions:
design(dds) < ~genotype + condition + genotype:condition
I could not figure it out how to do the pairwise comparison using the second way, so I decided to use the group factor:
#Read the counts from salmon library("tximport") library("jsonlite") library("readr") txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion = TRUE) #Analyze the data library("DESeq2") ddsTxi <- DESeqDataSetFromTximport(txi, colData = samples, design = ~ genotype + condition) ddsTxi$group <- factor(paste0(ddsTxi$genotype, ddsTxi$condition)) design(ddsTxi) <- ~group dds <- DEseq(ddsTxi) conditionA_WT_vs_KO <- results(dds, contrast=c("group", "WTA", "KOA")) conditionB_WT_vs_KO <- results(dds, contrast=c("group", "WTB", "KOB")) genotypeKO_A_vs_B <- results(dds, contrast=c("group", "KOA", "KOB"))
- Did I compare the samples correctly as I wanted?
- Will I get the same results using
~genotype + condition + genotype:conditiondesign?
- Do I need to specify the reference level, or re-set the reference level at some point (for examle, for genotypeKO_A_vs_B)?