DESeq2 analysis - lfcThreshold and lfcShrink options
ClaudiaE
Hello there,

I am having trouble understanding the differential expression analysis in DESeq2. My data is RNA-seq of a pathogen in planta. I have 4 biological replicates each for the control and treatment samples, and I want to compare them. I used StringTie for assembly and then tximport to extract the gene-level count data. I have read the vignette multiple times, as well as several posts here, but I am still confused about something:

After running the DESeq function, I applied specific filters to my data: an lfcThreshold of 0.585 and an alpha of 0.05. My understanding, based on the post "lfcThreshold on p-values", is that if I filter the results by LFC after the statistical test, I will be doing a post-hoc test and invalidating the original results from the Wald test. But when I run the lfcShrink function, I cannot add lfcThreshold because the output will then be s-values, which I do not want. After shrinkage, I see that the LFC cutoff in the summary changes to 0 and that the outliers and low-count genes are filtered out, but I still have the same number of up and down genes. Is it correct to apply lfcThreshold in results() and then shrink with lfcShrink, or should I stick with just one of them?

I have read other posts where people do both or just one, and it is never consistent; I know it is up to me anyway. It is just confusing to follow, and I am hoping to get some clarification. Thank you so much.


dds <- DESeq(dds)
dds <- estimateSizeFactors(dds)
res <- results(dds)

out of 13948 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 2999, 22%
LFC < 0 (down)     : 3372, 24%
outliers [1]       : 169, 1.2%
low counts [2]     : 1833, 13%
(mean count < 0)

res = res[complete.cases(res),]
summary(res)

out of 11947 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 2999, 25%
LFC < 0 (down)     : 3372, 28%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 0)

res <- results(dds,
name="condition_treatment_vs_control.",
alpha=0.05,lfcThreshold =0.585,
altHypothesis="greaterAbs")
summary(res)

out of 13948 with nonzero total read count
adjusted p-value < 0.05
LFC > 0.58 (up)    : 1592, 11%
LFC < -0.58 (down) : 1839, 13%
outliers [1]       : 169, 1.2%
low counts [2]     : 2093, 15%
(mean count < 1)

resDEG <- lfcShrink(dds, coef = "condition_treatment_vs_control.", type= "apeglm",
res=res)
resSig <- subset(resDEG, padj < 0.05)
mcols(res, use.names=TRUE)
summary(resSig)
nrow(resSig)

out of 3431 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)       : 1593, 46%
LFC < 0 (down)     : 1838, 54%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 1)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

@mikelove
lfcShrink doesn't change the p-values or adjusted p-values, you're just passing through the same information as in results(dds).

It will replace the p-values/adjusted p-values with s-values if you set svalue=TRUE, or if you ask lfcShrink to perform thresholded tests by specifying lfcThreshold=....
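The two cases described above can be sketched as follows. This is only an illustration under stated assumptions: it uses simulated data from makeExampleDESeqDataSet() rather than the poster's data, so the coefficient name "condition_B_vs_A" and the object names res, shr and shr_s are placeholders, and it assumes the apeglm package is installed.

```r
library(DESeq2)

# Simulated data standing in for the real experiment (assumption, not the poster's data)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 8, betaSD = 1)
dds <- DESeq(dds)

# Thresholded Wald test: p-values and adjusted p-values are computed here
res <- results(dds, name = "condition_B_vs_A",
               alpha = 0.05, lfcThreshold = 0.585,
               altHypothesis = "greaterAbs")

# Case 1: shrink the LFCs only, passing the thresholded results through via res=
shr <- lfcShrink(dds, coef = "condition_B_vs_A", type = "apeglm", res = res)
all.equal(res$padj, shr$padj)   # compare the adjusted p-values before and after shrinkage
colnames(shr)

# Case 2: ask lfcShrink itself for a thresholded test, which reports s-values instead
shr_s <- lfcShrink(dds, coef = "condition_B_vs_A", type = "apeglm",
                   lfcThreshold = 0.585)
colnames(shr_s)
```

In the first case only log2FoldChange and lfcSE are replaced by their shrunken versions, so subsetting on padj afterwards selects the same genes as in the results() table.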

Thank you so much, Dr. Love. I compared my results with and without lfcShrink, and I can now clearly see the difference. My concern is that I want p-values and adjusted p-values rather than s-values, and I know that if I include lfcThreshold in lfcShrink, as you mentioned, I will get s-values. I was confused about how to run both functions on my data.

I will just continue my analysis with the traditional adjusted p-values (padj).