Question

Performing simple pairwise comparisons in DESeq2

Raito92 ▴ 60

Hello everyone, I've been working on an RNA-seq project requiring an accurate DEG analysis. I've always used a pipeline based on edgeR and, more specifically, the function glmQLFTest (as explained in the workflow RNAseqGeneEdgeRQL). However, I'd like to perform my analysis with DESeq2 as well. Basically, edgeR can perform pairwise comparisons between pairs of groups. For example, given the following experimental design:

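| Sample | Variable1 | Variable2 |
|--------|-----------|-----------|
| 1      | A         | x         |
| 2      | A         | x         |
| 3      | A         | x         |
| 4      | A         | y         |
| 5      | A         | y         |
| 6      | A         | y         |
| 7      | B         | x         |
| 8      | B         | x         |
| 9      | B         | x         |
| 10     | B         | y         |
| 11     | B         | y         |
| 12     | B         | y         |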

group <- paste(targets$Variable1, targets$Variable2, sep=".")
group <- factor(group)
design <- model.matrix(~0+group)
colnames(design) <- levels(group)  # so contrasts can refer to A.x, B.x, ...

# fit is assumed to come from glmQLFit() on this design
Contrasts <- makeContrasts(A.x-B.x, levels=design)
res <- glmQLFTest(fit, contrast=Contrasts)

Contrasts2 <- makeContrasts(A.x-B.y, levels=design)
res2 <- glmQLFTest(fit, contrast=Contrasts2)

Contrasts3 <- makeContrasts(A.y-B.x, levels=design)
res3 <- glmQLFTest(fit, contrast=Contrasts3)

Contrasts4 <- makeContrasts(A.y-B.y, levels=design)
res4 <- glmQLFTest(fit, contrast=Contrasts4)

I've read the vignette several times, but I'm not sure how to perform this simple analysis with DESeq2.

I think I may need the 'Interactions' section of the DESeq2 vignette, so my question is...

Does the following code perform the 4 pairwise comparisons, as edgeR did?

I wrote this (adapted from the 'Interactions' section of the DESeq2 vignette):

dds$group <- factor(paste0(dds$Variable1, dds$Variable2))
design(dds) <- ~ group
dds <- DESeq(dds)
resultsNames(dds)
results(dds, contrast=c("group", "Ax", "Ay", "Bx", "By"))

It's also not clear to me what the design formula should be when I first define the dds object, before the code above. Maybe:

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Variable1 + Variable2)

or possibly

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Variable1 + Variable2 + Variable1:Variable2)

In the help page for this function, the description of the design argument reads:

> a formula which expresses how the counts for each gene depend on the variables in colData. Many R formula are valid, including designs with multiple variables, e.g., ~ group + condition, and designs with interactions, e.g., ~ genotype + treatment + genotype:treatment

What is exactly the difference between ~ Variable1 + Variable2 and ~ Variable1 + Variable2 + Variable1:Variable2?

Thanks in advance for your help!

edgeR DEG RNASeq DESEQ2 • 5.2k views

written 3.4 years ago by Raito92 • updated 3.4 years ago by Michael Love

score 0 · Answer 1 · 2022-02-02

Michael Love 43k

Pairwise analysis is an FAQ at the end of our vignette.

Answer written 3.4 years ago by Michael Love

I saw that, but it still doesn't answer my question.

> Can I use DESeq2 to analyze paired samples?
>
> Yes, you should use a multi-factor design which includes the sample information as a term in the design formula. This will account for differences between the samples while estimating the effect due to the condition. The condition of interest should go at the end of the design formula, e.g. ~ subject + condition.

It doesn't expand on how to do this between specific values of 'subject' and 'condition'.

Reply written 3.4 years ago by Raito92

I think I misunderstood. I see now that you don't have paired samples.

You can make the group variable before making the DESeqDataSet: create it as a column of the colData.
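A minimal sketch of that suggestion, assuming the 12-sample layout from the question. The counts are simulated with makeExampleDESeqDataSet and only stand in for a real count matrix, and the object names (cts, coldata, res_Ax_vs_Bx, ...) are illustrative. Each pairwise comparison is a separate results() call, because contrast takes exactly three values: the factor name, the numerator level, and the denominator level.

```r
library(DESeq2)

# Simulated counts purely for illustration (assumption: not the poster's data)
set.seed(1)
cts <- counts(makeExampleDESeqDataSet(n = 1000, m = 12))

# Sample table matching the layout in the question
coldata <- data.frame(
  Variable1 = factor(rep(c("A", "B"), each = 6)),
  Variable2 = factor(rep(rep(c("x", "y"), each = 3), times = 2)),
  row.names = colnames(cts)
)

# Build the combined factor in colData before creating the DESeqDataSet
coldata$group <- factor(paste0(coldata$Variable1, coldata$Variable2))

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ group)
dds <- DESeq(dds)

# One results() call per comparison:
# contrast = c(factor name, numerator level, denominator level)
res_Ax_vs_Bx <- results(dds, contrast = c("group", "Ax", "Bx"))
res_Ax_vs_By <- results(dds, contrast = c("group", "Ax", "By"))
res_Ay_vs_Bx <- results(dds, contrast = c("group", "Ay", "Bx"))
res_Ay_vs_By <- results(dds, contrast = c("group", "Ay", "By"))
```

The model is fitted once with DESeq(); only the contrast changes between the four result tables.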

Reply written 3.4 years ago by Michael Love

Thanks for your answer. How about performing multiple comparisons instead?

Again, the RNAseqGeneEdgeRQL workflow (using edgeR), in the section 'Analysis of Deviance', describes how to extend a comparison between two groups to three or more in a specific experimental design.

The aim is to identify genes which are DE among 3 or more groups (so, for instance, among the comparisons A.x-A.y, B.x-B.y and A.x-B.x). An output table is reported, showing the logFoldChange for each comparison, the logCPM, and just one set of test statistics per gene (p-value, FDR, F-statistic).

How can an equivalent analysis be performed in DESeq2?

Thanks in advance for your help

Reply written 3.4 years ago by Raito92

You can use an LRT (likelihood ratio test) in DESeq2 to test more than one coefficient at a time.

https://master.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#likelihood-ratio-test
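A rough sketch of the LRT route, again on simulated counts and with illustrative object names. The reduced formula ~ 1 is an assumption here: it corresponds to asking whether there is any difference among the four groups, and a different question would need a different reduced model.

```r
library(DESeq2)

# Simulated counts purely for illustration (assumption: not the poster's data)
set.seed(1)
cts <- counts(makeExampleDESeqDataSet(n = 1000, m = 12))

# Four groups of three samples, as in the question's layout
coldata <- data.frame(
  group = factor(rep(c("Ax", "Ay", "Bx", "By"), each = 3)),
  row.names = colnames(cts)
)

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ group)

# Full model: ~ group; reduced model: intercept only.
# The LRT compares the two fits, testing all group coefficients at once.
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ 1)
res_lrt <- results(dds_lrt)

# One test statistic, p-value and adjusted p-value per gene
head(res_lrt[, c("stat", "pvalue", "padj")])
```

Note that the log2FoldChange column of such a table still refers to a single coefficient, while the p-value comes from the comparison of the full and reduced models, as the linked vignette section explains.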

For setting up the statistical design and interpreting coefficients, I recommend consulting a local statistician. On the support site, I have to restrict my time to software-related questions.
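As a possible starting point for that discussion, the two formulas from the question can be compared by printing their model matrices in base R. This is only a sketch, assuming the 12-sample layout from the question; the object names are illustrative.

```r
# Sample table matching the layout in the question (no counts needed here)
coldata <- data.frame(
  Variable1 = factor(rep(c("A", "B"), each = 6)),
  Variable2 = factor(rep(rep(c("x", "y"), each = 3), times = 2))
)

# Additive design: one effect for B vs A and one for y vs x
mm_additive <- model.matrix(~ Variable1 + Variable2, data = coldata)

# Interaction design: adds a term that lets the y vs x effect differ in B
mm_interaction <- model.matrix(~ Variable1 + Variable2 + Variable1:Variable2,
                               data = coldata)

colnames(mm_additive)
colnames(mm_interaction)
```

The additive design has three columns and assumes the y vs x effect is the same in A and B. The interaction design has a fourth column, Variable1B:Variable2y, so it has one parameter per group, the same number as a ~ group design on the four combined levels.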

Reply written 3.4 years ago by Michael Love
