Question: Help with design matrix dealing with 3 variables (DESeq2)
0
2.4 years ago by
martin.weihrauch10 wrote:

I have an RNA-Seq dataset with 3 variables and 45 samples:

muscle (A or B)

genotype (WT or TG)

region (1 or 2)

So I want to compare e.g. WT/A/1 against WT/A/2, but also all other possible comparisons.

So far I've used the following design:

dds <- DESeqDataSetFromMatrix(counttable, colData = coldata, design= ~ genotype + muscle + region)
dds <- dds[rowSums(counts(dds)) > 1, ] # Removes rows with zero reads in all samples
dds <- DESeq(dds) # Runs DESeq2

dds2 <- dds # Copy dds to dds2
dds2$group <- factor(paste0(dds2$genotype, dds2$muscle, dds2$region))
design(dds2) <- ~ group
dds2 <- DESeq(dds2)

resultsNames(dds2) # Here I get all the possible groups and then I could finally contrast them

results_WT/A/1_vs_WT/A/2 <- results(dds2, contrast=c("group", "WT/A/1", "WT/A/2"))

However, if I do the whole analysis with only one set of conditions I'd liek to compare, my differential expression list will look quite a lot different. e.g. I only use featureCounts on those .bam files that correspond to WT/A for region 1 and 2, so I can easily contrast WT/A/1 vs. WT/A/2. Just the result is quite different from using the design I showed above.

Am I doing this correctly in general, and which method should I use to correctly interpret my RNA-Seq dataset specifically (~ group design or doing the comparisons one by one with a simple design e.g. design = ~ region     ... and only using the files for one condition).

Thanks.

deseq2 rna-seq • 1.0k views
modified 2.4 years ago by Michael Love26k • written 2.4 years ago by martin.weihrauch10

Can you give more description on what region?

If you consider only Muscle type and genotype this design will suite good design=~ genotype+ muscle +muscle:genotype

You will have the expression across the genotypes in different muscle groups

The region is the most important part, as it is laser-capture microdissected muscle tissue.

It would be ideal to be able to compare the two different types of muscle at the region of interest vs. the control region. The genotype would be maybe a little secondary for now, but still I want to see whether the genotype brings changes in the region of interest and between the muscles.

Answer: Help with design matrix dealing with 3 variables (DESeq2)
0
2.4 years ago by
Michael Love26k
United States
Michael Love26k wrote:

There doesn't seem to be any reason to run DESeq() twice here. The results from the first run are wiped out by the second. The first line with DESeq() can be deleted.

"the result is quite different [running altogether or as separate groups]"

This is actually quite a common question, and it's listed as a FAQ in the DESeq2 vignette.

The general recommendation is to run with all the samples together, but I have more specific advice in the vignette.

Dear Michael,