Question

DESeq2 LRT output help

2

Entering edit mode

rclaire ▴ 20

@rclaire-7689

Last seen 7.1 years ago

United States

Hello,

I am using DESeq2 to perform LRT analysis of a multi-treatment experiment, like an ANOVA. I have 5 different conditions, each with 4 replicates, and would like to generate a single p-value that indicates difference among the conditions, but not specific to any two conditions (as in a t-test).

I have run it as I best understand from the other threads I have read and the manual, and I want to be sure I understand the output. The commands I entered and the outputs generated in R are pasted below.

I can't tell from the output if the p-value is for comparing all 5 conditions, or just 12dpd 0 vs 12dpd 4. When I run mcols(res)$description (see below) it says "LRT p-value: '~ group' vs '~ 1'" and "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0". Does this mean I am only generating the p-value based on differences between the two groups 12dpd 4 and 12dpd 0? or all of the groups? (12dpd 0 is the first and 12dpd 4 is the last in the colData set-up).

R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)

> countData<- read.table("/Users/claireriggs/R/12dpd_annot_shared_exp_051415.txt", header=TRUE, row.names=1)

> colData<- read.table ("/Users/claireriggs/Desktop/mRNA/DESeq2/12dpd_sampleinfo.txt", header=TRUE, row.names=1)

note: colData looks like this:

	run	trmt	group
1	t0_1	t0	12dpd_0
2	t0_2	t0	12dpd_0
3	t0_3	t0	12dpd_0
4	t0_4	t0	12dpd_0
5	4hrs_anoxia_1	4hrs_anoxia	12dpd_1
6	4hrs_anoxia_2	4hrs_anoxia	12dpd_1
7	4hrs_anoxia_3	4hrs_anoxia	12dpd_1
8	4hrs_anoxia_4	4hrs_anoxia	12dpd_1
9	24hrs_anoxia_1	24hrs_anoxia	12dpd_2
10	24hrs_anoxia_2	24hrs_anoxia	12dpd_2
11	24hrs_anoxia_3	24hrs_anoxia	12dpd_2
12	24hrs_anoxia_4	24hrs_anoxia	12dpd_2
13	2hrs_recov_1	2hrs_recov	12dpd_3
14	2hrs_recov_2	2hrs_recov	12dpd_3
15	2hrs_recov_3	2hrs_recov	12dpd_3
16	2hrs_recov_4	2hrs_recov	12dpd_3
17	24hrs_recov_1	24hrs_recov	12dpd_4
18	24hrs_recov_2	24hrs_recov	12dpd_4
19	24hrs_recov_3	24hrs_recov	12dpd_4
20	24hrs_recov_4	24hrs_recov	12dpd_4

> dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design= ~trmt)
> design(dds) = ~ group
> dds = DESeq(dds, test = "LRT", reduced = ~ 1)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
function: y = a/x + b, and a local regression fit was automatically substituted.
specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates
fitting model and testing
> res <- results(dds)
> summary(res)

out of 52619 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 24794, 47%
LFC < 0 (down) : 22871, 43%
outliers [1] : 149, 0.28%
low counts [2] : 0, 0%
(mean count < 0.2)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

> mcols(res)$description
[1] "mean of normalized counts for all samples" "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0"
[3] "standard error: group 12dpd 4 vs 12dpd 0" "LRT statistic: '~ group' vs '~ 1'"
[5] "LRT p-value: '~ group' vs '~ 1'" "BH adjusted p-values"

deseq2 LRT anova • 9.1k views

ADD COMMENT • link 8.9 years ago rclaire ▴ 20

0

Entering edit mode

Got it! Thanks, Michael. And sorry I didn't find it on my own first.

ADD REPLY • link 8.9 years ago rclaire ▴ 20

score 2 · Answer 1 · 2015-05-19

"I can't tell from the output if the p-value is for comparing all 5 conditions, or just 12dpd 0 vs 12dpd 4. When I run mcols(res)$description (see below) it says "LRT p-value: '~ group' vs '~ 1'" and "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0". Does this mean I am only generating the p-value based on differences between the two groups 12dpd 4 and 12dpd 0? or all of the groups? (12dpd 0 is the first and 12dpd 4 is the last in the colData set-up)."

Generally, the first place to go for help is the help file for the function. If you look in ?results it says in the Details section, under On p-values:

"For analyses using the likelihood ratio test (using nbinomLRT), the p-values are determined solely by the difference in deviance between the full and reduced model formula. A log2 fold change is included, which can be controlled using the name argument, or by default this will be the estimated coefficient for the last element of resultsNames(object)."

So in answer to your question, the p-value is for the comparison of ~group vs ~1 model, while the fold change is for one of the terms in the model. You can extract the fold changes one at a time with 'name' argument.

Note that the likelihood ratio tests for any differences across the grouping variable (so even if 1 group shows DE compared to the other 4).