DESeq2 LRT output help
1
0
Entering edit mode
rclaire • 0
@rclaire-7689
Last seen 4.5 years ago
United States

Hello,

I am using DESeq2 to perform LRT analysis of a multi-treatment experiment, like an ANOVA. I have 5 different conditions, each with 4 replicates, and would like to generate a single p-value that indicates difference among the conditions, but not specific to any two conditions (as in a t-test).

I have run it as I best understand from the other threads I have read and the manual, and I want to be sure I understand the output. The commands I entered and the outputs generated in R are pasted below.

I can't tell from the output if the p-value is for comparing all 5 conditions, or just 12dpd 0 vs 12dpd 4. When I run mcols(res)$description (see below) it says "LRT p-value: '~ group' vs '~ 1'" and "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0". Does this mean I am only generating the p-value based on differences between the two groups 12dpd 4 and 12dpd 0? or all of the groups? (12dpd 0 is the first and 12dpd 4 is the last in the colData set-up). R version 3.2.0 (2015-04-16) -- "Full of Ingredients" Copyright (C) 2015 The R Foundation for Statistical Computing Platform: x86_64-apple-darwin13.4.0 (64-bit) > countData<- read.table("/Users/claireriggs/R/12dpd_annot_shared_exp_051415.txt", header=TRUE, row.names=1) > colData<- read.table ("/Users/claireriggs/Desktop/mRNA/DESeq2/12dpd_sampleinfo.txt", header=TRUE, row.names=1) note: colData looks like this:  run trmt group 1 t0_1 t0 12dpd_0 2 t0_2 t0 12dpd_0 3 t0_3 t0 12dpd_0 4 t0_4 t0 12dpd_0 5 4hrs_anoxia_1 4hrs_anoxia 12dpd_1 6 4hrs_anoxia_2 4hrs_anoxia 12dpd_1 7 4hrs_anoxia_3 4hrs_anoxia 12dpd_1 8 4hrs_anoxia_4 4hrs_anoxia 12dpd_1 9 24hrs_anoxia_1 24hrs_anoxia 12dpd_2 10 24hrs_anoxia_2 24hrs_anoxia 12dpd_2 11 24hrs_anoxia_3 24hrs_anoxia 12dpd_2 12 24hrs_anoxia_4 24hrs_anoxia 12dpd_2 13 2hrs_recov_1 2hrs_recov 12dpd_3 14 2hrs_recov_2 2hrs_recov 12dpd_3 15 2hrs_recov_3 2hrs_recov 12dpd_3 16 2hrs_recov_4 2hrs_recov 12dpd_3 17 24hrs_recov_1 24hrs_recov 12dpd_4 18 24hrs_recov_2 24hrs_recov 12dpd_4 19 24hrs_recov_3 24hrs_recov 12dpd_4 20 24hrs_recov_4 24hrs_recov 12dpd_4 > dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design= ~trmt) > design(dds) = ~ group > dds = DESeq(dds, test = "LRT", reduced = ~ 1) estimating size factors estimating dispersions gene-wise dispersion estimates mean-dispersion relationship -- note: fitType='parametric', but the dispersion trend was not well captured by the function: y = a/x + b, and a local regression fit was automatically substituted. specify fitType='local' or 'mean' to avoid this message next time. final dispersion estimates fitting model and testing > res <- results(dds) > summary(res) out of 52619 with nonzero total read count adjusted p-value < 0.1 LFC > 0 (up) : 24794, 47% LFC < 0 (down) : 22871, 43% outliers [1] : 149, 0.28% low counts [2] : 0, 0% (mean count < 0.2) [1] see 'cooksCutoff' argument of ?results [2] see 'independentFiltering' argument of ?results > mcols(res)$description
[1] "mean of normalized counts for all samples"        "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0"
[3] "standard error: group 12dpd 4 vs 12dpd 0"         "LRT statistic: '~ group' vs '~ 1'"
[5] "LRT p-value: '~ group' vs '~ 1'"                  "BH adjusted p-values"

deseq2 LRT anova • 7.1k views
0
Entering edit mode

Got it! Thanks, Michael. And sorry I didn't find it on my own first.

0
Entering edit mode
@mikelove
Last seen 15 hours ago
United States

"I can't tell from the output if the p-value is for comparing all 5 conditions, or just 12dpd 0 vs 12dpd 4. When I run mcols(res)\$description (see below) it says "LRT p-value: '~ group' vs '~ 1'" and "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0". Does this mean I am only generating the p-value based on differences between the two groups 12dpd 4 and 12dpd 0? or all of the groups? (12dpd 0 is the first and 12dpd 4 is the last in the colData set-up)."

Generally, the first place to go for help is the help file for the function. If you look in ?results it says in the Details section, under On p-values:

"For analyses using the likelihood ratio test (using nbinomLRT), the p-values are determined solely by the difference in deviance between the full and reduced model formula. A log2 fold change is included, which can be controlled using the name argument, or by default this will be the estimated coefficient for the last element of resultsNames(object)."

So in answer to your question, the p-value is for the comparison of ~group vs ~1 model, while the fold change is for one of the terms in the model. You can extract the fold changes one at a time with 'name' argument.

Note that the likelihood ratio tests for any differences across the grouping variable (so even if 1 group shows DE compared to the other 4).