DESeq2 LRT output help
1
2
Entering edit mode
rclaire ▴ 20
@rclaire-7689
Last seen 7.7 years ago
United States

Hello, 

I am using DESeq2 to perform LRT analysis of a multi-treatment experiment, like an ANOVA. I have 5 different conditions, each with 4 replicates, and would like to generate a single p-value that indicates difference among the conditions, but not specific to any two conditions (as in a t-test).

I have run it as I best understand from the other threads I have read and the manual, and I want to be sure I understand the output. The commands I entered and the outputs generated in R are pasted below.

I can't tell from the output if the p-value is for comparing all 5 conditions, or just 12dpd 0 vs 12dpd 4. When I run mcols(res)$description (see below) it says "LRT p-value: '~ group' vs '~ 1'" and "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0". Does this mean I am only generating the p-value based on differences between the two groups 12dpd 4 and 12dpd 0? or all of the groups? (12dpd 0 is the first and 12dpd 4 is the last in the colData set-up).

R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)

> countData<- read.table("/Users/claireriggs/R/12dpd_annot_shared_exp_051415.txt", header=TRUE, row.names=1)

> colData<- read.table ("/Users/claireriggs/Desktop/mRNA/DESeq2/12dpd_sampleinfo.txt", header=TRUE, row.names=1)

note: colData looks like this:

  run trmt group
1 t0_1 t0 12dpd_0
2 t0_2 t0 12dpd_0
3 t0_3 t0 12dpd_0
4 t0_4 t0 12dpd_0
5 4hrs_anoxia_1 4hrs_anoxia 12dpd_1
6 4hrs_anoxia_2 4hrs_anoxia 12dpd_1
7 4hrs_anoxia_3 4hrs_anoxia 12dpd_1
8 4hrs_anoxia_4 4hrs_anoxia 12dpd_1
9 24hrs_anoxia_1 24hrs_anoxia 12dpd_2
10 24hrs_anoxia_2 24hrs_anoxia 12dpd_2
11 24hrs_anoxia_3 24hrs_anoxia 12dpd_2
12 24hrs_anoxia_4 24hrs_anoxia 12dpd_2
13 2hrs_recov_1 2hrs_recov 12dpd_3
14 2hrs_recov_2 2hrs_recov 12dpd_3
15 2hrs_recov_3 2hrs_recov 12dpd_3
16 2hrs_recov_4 2hrs_recov 12dpd_3
17 24hrs_recov_1 24hrs_recov 12dpd_4
18 24hrs_recov_2 24hrs_recov 12dpd_4
19 24hrs_recov_3 24hrs_recov 12dpd_4
20 24hrs_recov_4 24hrs_recov 12dpd_4


> dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design= ~trmt)
> design(dds) = ~ group
> dds = DESeq(dds, test = "LRT", reduced = ~ 1)

estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates
fitting model and testing
> res <- results(dds)
> summary(res)

out of 52619 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)     : 24794, 47% 
LFC < 0 (down)   : 22871, 43% 
outliers [1]     : 149, 0.28% 
low counts [2]   : 0, 0% 
(mean count < 0.2)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

> mcols(res)$description
[1] "mean of normalized counts for all samples"        "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0"
[3] "standard error: group 12dpd 4 vs 12dpd 0"         "LRT statistic: '~ group' vs '~ 1'"               
[5] "LRT p-value: '~ group' vs '~ 1'"                  "BH adjusted p-values" 

 

 

deseq2 LRT anova • 9.5k views
ADD COMMENT
0
Entering edit mode

Got it! Thanks, Michael. And sorry I didn't find it on my own first. 

ADD REPLY
2
Entering edit mode
@mikelove
Last seen 4 hours ago
United States

"I can't tell from the output if the p-value is for comparing all 5 conditions, or just 12dpd 0 vs 12dpd 4. When I run mcols(res)$description (see below) it says "LRT p-value: '~ group' vs '~ 1'" and "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0". Does this mean I am only generating the p-value based on differences between the two groups 12dpd 4 and 12dpd 0? or all of the groups? (12dpd 0 is the first and 12dpd 4 is the last in the colData set-up)."

Generally, the first place to go for help is the help file for the function. If you look in ?results it says in the Details section, under On p-values:

"For analyses using the likelihood ratio test (using nbinomLRT), the p-values are determined solely by the difference in deviance between the full and reduced model formula. A log2 fold change is included, which can be controlled using the name argument, or by default this will be the estimated coefficient for the last element of resultsNames(object)."

So in answer to your question, the p-value is for the comparison of ~group vs ~1 model, while the fold change is for one of the terms in the model. You can extract the fold changes one at a time with 'name' argument.

Note that the likelihood ratio tests for any differences across the grouping variable (so even if 1 group shows DE compared to the other 4).

ADD COMMENT

Login before adding your answer.

Traffic: 817 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6