Hello, I am working on a differential expression analysis comparing samples with different tissue color. I have 10 samples from dark tissue and 6 samples from white tissue. I created my DESeq dataset as follows:
dds <- DESeqDataSetFromMatrix(countData = counts.in, colData = design.matrix, design = ~ tissue_color + batch)
row.names(dds) <- row.names(counts.in)
keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep,]
I tried to find differentially expressed genes using an LFC threshold, with apeglm for LFC shrinkage.
dds <- DESeq(dds)
res <- results(dds,
               contrast = c("tissue_color", "white", "dark"),
               alpha = 0.01)
res.shrunk <- lfcShrink(dds,
                        res = res,
                        coef = 2,
                        lfcThreshold = 1,
                        type = "apeglm")
summary(res.shrunk, alpha=0.01)
I visualized the results in a volcano plot and found that the FSOS (false sign or smaller) rate, i.e. the s-value, is exactly 0 for some genes, all of them with negative LFCs. When I releveled the tissue_color factor to use the other tissue color as the reference level, I got a slightly different number of significant genes, and again the genes with an FSOS rate of 0 all had negative LFCs.
dds$tissue_color <- relevel(dds$tissue_color, "white")
# before relevel:
> summary(res.comp1, alpha=0.01)
out of 19516 with nonzero total read count
s-value < 0.01
LFC > 1.00 (up) : 2040, 10%
LFC < -1.00 (down) : 1956, 10%
# after relevel:
> summary(res.comp2, alpha=0.01)
out of 19516 with nonzero total read count
s-value < 0.01
LFC > 1.00 (up) : 1960, 10%
LFC < -1.00 (down) : 2039, 10%
The colored points show the genes more highly expressed in that tissue (before and after releveling), with the genes whose FSOS rate equals 0 in the top left of the plot. I am wondering why only genes with a negative LFC get an FSOS rate of 0, even after changing the reference level of the analysis. I would have expected only the sign of the LFC to change after releveling. Is this behaviour expected, or is there something wrong with my analysis? Thank you kindly for any help.
Dear Dr. Love, thank you very much for your answer. I think I need to describe my problem more precisely. I did run the computation again after releveling. Between the two comparisons, I changed the coefficient from "tissue_color_white_vs_dark" to "tissue_color_dark_vs_white". Perhaps this code makes it clearer:
In both comparisons I noticed genes with svalue == 0. However, even after changing the reference level of the comparison, only genes with a negative LFC have an svalue of 0.
Why do I see svalues of 0 only for genes with negative LFCs, even after changing the order of tissues in the coefficient? Is this to be expected or should I recheck my analysis?
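For readers following along, the workflow described here (relevel the factor, re-run DESeq, then shrink using the renamed coefficient) can be sketched as below. This is a minimal sketch on simulated data from makeExampleDESeqDataSet, not the original analysis: the condition factor with levels A and B stands in for tissue_color, and the object names and the helper count_zero_svalues are hypothetical. It assumes the DESeq2 and apeglm packages are installed.

```r
library(DESeq2)  # lfcShrink(type = "apeglm") also needs the apeglm package

set.seed(1)
# Simulated stand-in data: 'condition' (levels A, B) plays the role of tissue_color.
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)
dds <- DESeq(dds, quiet = TRUE)
print(resultsNames(dds))  # coefficient names that lfcShrink accepts

# Comparison 1: B vs A, with an LFC threshold so that s-values are reported.
res_b_vs_a <- lfcShrink(dds, coef = "condition_B_vs_A",
                        type = "apeglm", lfcThreshold = 1, quiet = TRUE)

# Comparison 2: change the reference level, then re-fit before shrinking,
# because the coefficient names depend on the reference level.
dds$condition <- relevel(dds$condition, ref = "B")
dds <- DESeq(dds, quiet = TRUE)
res_a_vs_b <- lfcShrink(dds, coef = "condition_A_vs_B",
                        type = "apeglm", lfcThreshold = 1, quiet = TRUE)

# Hypothetical helper: count exact-zero s-values, split by LFC sign.
count_zero_svalues <- function(res) {
  zero <- !is.na(res$svalue) & res$svalue == 0
  c(negative_lfc = sum(zero & res$log2FoldChange < 0),
    positive_lfc = sum(zero & res$log2FoldChange > 0))
}
print(count_zero_svalues(res_b_vs_a))
print(count_zero_svalues(res_a_vs_b))
```

Referring to the coefficient by name rather than by position (coef = 2) makes it harder to shrink the wrong term by accident after the reference level changes. On simulated data the zero counts may well all be 0; the point is only the shape of the comparison.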
I don't think this is a relevant problem. There is an arbitrary point at which R rounds a very, very small number to 0. I would focus instead on a meaningful threshold, e.g. svalue < .001, or on whatever error rate you want to control.
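To illustrate the point about very small numbers becoming 0, here is a minimal base-R sketch. It is not taken from DESeq2 or apeglm, and whether this is the exact mechanism behind the s-values in this thread is an assumption. It only shows one generic way a one-sided pattern of zeros can arise in double precision: a tail probability computed directly stays tiny but nonzero, while the same quantity computed as 1 minus a number very close to 1 rounds to exactly 0. The svalue vector is made up for illustration.

```r
# Two mathematically equivalent ways to get an upper-tail normal probability.
z <- 10
p_direct     <- pnorm(z, lower.tail = FALSE)  # computed directly in the tail
p_complement <- 1 - pnorm(z)                  # 1 minus a number very close to 1

print(p_direct)      # tiny but nonzero
print(p_complement)  # exactly 0, because pnorm(10) rounds to 1 as a double

# Relative spacing of doubles near 1; differences much smaller than this are lost.
print(.Machine$double.eps)

# Hypothetical s-values: an exact zero and a tiny value pass the same threshold,
# so the choice between them does not change which genes are called.
svalue <- c(0, p_direct, 5e-4, 0.02)
print(svalue < 0.001)
```

Either way, a gene with an s-value of exactly 0 and a gene with an s-value of 1e-20 fall on the same side of any practical cutoff.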