Question

DESeq2 lfcShrink: swapping reference gives different results for the same comparison

0

Entering edit mode

Daphnia • 0

@b83c1e8a

Last seen 12 months ago

Germany

Hi!

I noticed that my volcano plots resulting from lfcShrink() on contrast A_vs_B look different from the results of lfcShrink() on B_vs_A. I expected the absolute values of the l2FC to be approximately similar, but they are skewed towards the upregulated genes, i.e. the positive l2FC values tend to be higher in both comparisons. The unshrunken values behave as expected; the volcano plots are flipped along the y-axis. When I plotted the shrunken l2FCs of both contrasts against eachother, the vast majority was linearly distributed, but some (including some significant ones) deviated quite a bit from that. Thus, the ranking changes as well as the visulal representation. I tested this mostly using apeglm, but when I tried ashr, the absolute values didn't match up either. There are not too many DGEs in this data set, otherwise I might have missed it.

I hope you can help me clear this up, since intuitively, I don't see why the order of the contrast should have an impact on the absolute l2FC values and I'm not sure what it means for my experiment.

Thanks in advance! Maike

This is my code:

# A vs B
coldata$strain <- relevel(coldata$strain, 'B')
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ donor + strain)
dds <- DESeq(dds)
# raw (non-shrunken) DEG results:
res_AB <- results(dds, name = "strain_A_vs_B")
# shrink:
map_AB <- lfcShrink(dds, coef = "strain_A_vs_B")

# B vs A
coldata$strain <- relevel(coldata$strain, 'A')
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ donor + strain)
dds <- DESeq(dds)
# raw (non-shrunken) DEG results:
res_BA <- results(dds, name = "strain_B_vs_A")
# shrink:
map_BA <- lfcShrink(dds, coef = "strain_B_vs_A")

# l2fc plot:
# plot: compare L2FCs - all values:
df <- data.frame (row.names = rownames(as.data.frame(map_AB)))
df$l2fc_AB <- (as.data.frame(map_AB)$log2FoldChange)
df$l2fc_BA <- (as.data.frame(map_BA)$log2FoldChange)

Plot_01 <- ggplot(df, aes(x=l2fc_AB, y=l2fc_BA)) +
  geom_point()

Here are the volcano plots aand the scatter plot: This are the two resulting volcano plots: This is the l2fc scatter plot

DESeq2 apeglm • 776 views

ADD COMMENT • link 13 months ago • updated 12 months ago Daphnia • 0

score 3 · Accepted Answer · 2023-03-24

3

Entering edit mode

ATpoint ★ 4.0k

@atpoint-13662

Last seen 10 minutes ago

Germany

I asked the same some years back: apeglm -- influence of reference level on logFC and svalues

ADD COMMENT • link 13 months ago ATpoint ★ 4.0k

0

Entering edit mode

Thank you a lot! I missed that post when I was trying to to find out what is happenenig. The ranking missmatch I observed was indeed due to low readcounts in one of the saples and prefiltering removed the affected genes.

ADD REPLY • link 12 months ago Daphnia • 0