
I'm having a bit of trouble understanding how the `lfcThreshold` parameter of the `results` function of `DESeq2` affects the p-value. Imposing a stricter `lfcThreshold` reduces the number of significant results, which makes sense; however, it does so far beyond what I would have expected.

From my data:

    lfc <- 0
    thrHuh <- results(dds_huh7, contrast = c("group", "infected_3", "uninfected_3"), alpha = alpha, lfcThreshold = lfc)
    summary(thrHuh)

    out of 72860 with nonzero total read count
    adjusted p-value < 0.05
    LFC > 0 (up)     : 1545, 2.1%
    LFC < 0 (down)   : 1278, 1.8%
    outliers [1]     : 39, 0.054%
    low counts [2]   : 42376, 58%
    (mean count < 6)

    sig <- subset(thrHuh, thrHuh$padj < 0.05)
    nrow(subset(sig, sig$log2FoldChange >= 2 | sig$log2FoldChange <= -2))
    [1] 495

My interpretation of this is that I have 495 significant genes with at least a 4-fold change. My (obviously incorrect) understanding is that if I set `lfcThreshold = 2` I would get the same results, but I don't:

    lfc <- 2
    thrHuh <- results(dds_huh7, contrast = c("group", "infected_3", "uninfected_3"), alpha = alpha, lfcThreshold = lfc)
    summary(thrHuh)

    out of 72860 with nonzero total read count
    adjusted p-value < 0.05
    LFC > 0 (up)     : 21, 0.029%
    LFC < 0 (down)   : 3, 0.0041%
    outliers [1]     : 39, 0.054%
    low counts [2]   : 8476, 12%
    (mean count < 0)

I've looked at the DESeq2 paper, the vignette, and the workflow, and cannot figure out why one method gives 495 genes and the other 24. Is it simply because fewer tests are being run in the latter case, and this is reflected in the adjusted p-value? If I only want to consider genes that are a) significant and b) have at least a 4-fold change, which is the better approach to use?
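To make the comparison concrete, here is a toy sketch in base R of the two things I think I am comparing. The numbers (`lfc_hat`, `se`) are made up and not from my data, and the threshold p-value formula is only my assumption about what `lfcThreshold` might be doing (a Wald-style test of |LFC| > threshold rather than LFC != 0); I have not confirmed this against the DESeq2 source.

```r
# Hypothetical values for a single gene (NOT from my data):
# estimated log2 fold change and its standard error.
lfc_hat   <- 2.5
se        <- 0.5
threshold <- 2

# Two-sided Wald test against the usual null hypothesis, LFC = 0.
p_zero <- 2 * pnorm(abs(lfc_hat) / se, lower.tail = FALSE)

# ASSUMPTION: a Wald-style test against the null |LFC| <= threshold,
# which is my guess at what lfcThreshold = 2 tests.
p_thresh <- min(1, 2 * pnorm((abs(lfc_hat) - threshold) / se, lower.tail = FALSE))

# Post hoc filter: significant against zero AND estimate beyond the threshold.
passes_filter <- p_zero < 0.05 && abs(lfc_hat) >= threshold

print(c(p_zero = p_zero, p_thresh = p_thresh))
print(passes_filter)
```

If that assumption holds, a gene like this one would survive my post hoc filter but would not be called significant by the threshold test, which might account for at least part of the gap between 495 and 24.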

Thanks,

Russ

edits: formatting mistakes...

written 15 months ago by Russ Fraser
