Question

Error bars DESeq or DESeq2 fold change

1

Entering edit mode

lisa.crossman ▴ 10

@lisacrossman-10085

Last seen 8.3 years ago

Hi,

Some time ago I ran DESeq (one) on a set of data comparing two conditions of growth over 4 biological replicates per each condition.

A collaborator constructed a graph of fold change from that output for approximately 15 genes of interest.

I have a request from a peer reviewer to provide error bars on that graph.

1. Does it make sense to provide error bars on a graph of FoldChange in this way?

2. If so can I generate them for FoldChange either in DESeq or DESeq2 with an R command? I do have lfcSE from DESeq2 but 2^lfcSE does not seem to make sense, is there a way to generate?

3. Can I get baseMeanA and baseMean B from DESeq2 or would that be identical to the DESeq1 output?

Thanks,

Lisa

rnaseq deseq2 deseq standard error confidence interval • 11k views

ADD COMMENT • link updated 2.7 years ago by Michael Love 42k • written 8.3 years ago by lisa.crossman ▴ 10

score 5 · Answer 1 · 2016-04-12

5

Entering edit mode

Michael Love 42k

@mikelove

Last seen 9 hours ago

United States

Yes you can add error bars to a (log) fold change.

First, I want to say that I always want to see log fold changes, not fold changes, on a plot. The reason is that log fold changes have the property that equal multiplicative changes in either direction (say, up- or down-regulated) cover equal distances on the plot.

Compare:

lfc <- -4:4
par(mfrow=c(1,2))
dotchart(lfc);abline(v=0)
dotchart(2^lfc);abline(v=1)

It's very difficult to see in the second plot that the multiplicative effects to the left of the vertical bar (no change) are equal and opposite effects as those to the right.

Estimated standard errors for the estimated coefficients on the log2 scale are given by the lfcSE column. Yes, you can convert these to fold change errors using this formula: 2^(coef - SE) and 2^(coef + SE).

If you multiple lfcSE by normal quantiles (e.g. qnorm(.025)), you can construct confidence intervals for the coefficients.

DESeq2 is more general purpose that DESeq and so doesn't provide baseMean A and B. You can easily calculate this yourself though. See:

A: DESeq2 baseMean values for each sample

ADD COMMENT • link 8.3 years ago Michael Love 42k

0

Entering edit mode

Michael Love is it possible to extract the log fold change of a specific group? I tried to produce the log2FoldChange of a specific gene for 2 different conditions using the following code chunk with a DESeqDataSet object.

gene = sapply(levels(dds$condition), function(lvl) counts(dds, normalized = T)["ENSG00000189221.9",dds$condition %in% lvl])
untrt = log2(colMeans(gene)[1]/colMeans(gene)[2])
trt = log2(colMeans(gene)[2]/colMeans(gene)[1])

However, the value I got is different from the log2FoldChange from the result table. How can I obtain the log fold change value to create an error plot?

ADD REPLY • link 2.7 years ago p.l.chen • 0

0

Entering edit mode

I don't know what you mean by "LFC for a specific group".

Maybe post your question with full details, see the support posting guide:

bioconductor.org/help/support/posting-guide

ADD REPLY • link 2.7 years ago Michael Love 42k

score 3 · Answer 2 · 2016-04-12

Hi

1. Yes, I think error bars do make sense here. The results table returned by DESeq2 contains the columns 'log2FoldChange' with the (shrunken) fold change estimates, and right next to it, the standard errors of these estimates ('lfcSE'). If you add and subtract the standard error to the fold change estimates.you get positions for the upper and lower whiskers of your error bars. Note that these are standard errors, not 95%-CIs, and that there is not correction for multiple testing.

2. You should use 2^(lfc+lfcSE) and 2^(lfc-lfcSE).

Now, this will produce ugly looking asymmetric error bars. My recommendation is to always plot fold changes on a logarithmic scale. After all, you would want a change of 2-fold up (2x) and two-fold down (0.5x) to look the same, though in different directions. Sometimes, people are worried that mathematically less savvy readers find log2-transformed values harder to read or interpret intuitively than raw values. My recommendation here is to nevertheless stick to log-scale axes, but label them in the real scale: Do not label the y axis "log2 fold change" and mark off the equidistant ticks as -3, -2, -1, 0, 1, 2, 3, but label the axis as "fold change" and mark the ticks as 1/8, 1/4, 1/2, 1, 2, 4, 8.

3. The two baseMean values were simply the averages of the normalized counts for the two groups. As DESeq2 is layed out for more general designs, we removed this left-over from the earliest DESeq versions which only allowed for two-group comparisons. If you still need an average over the group, run 'rowMeans' on a suitable subset of columns from 'counts( dds, normalized=TRUE )'.

score 0 · Answer 3 · 2016-04-12

0

Entering edit mode

lisa.crossman ▴ 10

@lisacrossman-10085

Last seen 8.3 years ago

Brilliant thank you! I have generated your above figure to discuss with the collaborators

ADD COMMENT • link 8.3 years ago lisa.crossman ▴ 10

0

Entering edit mode

Thanks so much! I have now generated the numbers for the upper and lower whiskers and the BaseMeanA and BaseMeanB. I have plotted a few example genes using the lower and upper numbers from the lfc+/-lfcSE as ymin and ymax to ggplot2 geom_errorbar().

ADD REPLY • link 8.3 years ago lisa.crossman ▴ 10