Question

All same padj of contrast results in the DESeq2

0

Entering edit mode

jjinhyoungkim ▴ 20

@jjinhyoungkim-8476

Last seen 8.6 years ago

Canada

Hi,

I am trying to run DESeq2 with the duplicated raw counts from 4 groups and 2 conditions. In the comparison using "result ( contrast)", some comparisons had all same padj (0.9999874). Is it normal or anything wrong? FYI, the code are below. Thank you in advance for your time.

>library("DESeq2")
>countMatrix = read.table("~/Desktop/aaa.txt",header = T,row.names = 1)
>coldata = data.frame(row.names =colnames(countMatrix),group =rep(c("gt1","gt2","gt3","gt4"),2,each = 2),treatment = rep(c("control","treated"),each= 8))
>coldata$treatment = factor(x = coldata$treatment,levels = c('control','treated'))>dds = DESeqDataSetFromMatrix(countData =countMatrix, colData = coldata, design = ~ group + treatment +group:treatment)
>dds = DESeq(dds)
> resultsNames(dds)
 [1] "Intercept"       "groupgt1"       "groupgt2"       "groupgt3"                 [5] "groupgt4"   "treatmentcontrol"    "treatmenttreated" "groupgt1.treatmentcontrol"
 [9] "groupgt2.treatmentcontrol" "groupgt3.treatmentcontrol" "groupgt4.treatmentcontrol" "groupgt1.treatmenttreated"
[13] "groupgt2.treatmenttreated" "groupgt3.treatmenttreated" "groupgt4.treatmenttreated"
> res <-results(dds, contrast=c("treatment", "treated", "control"))
> res

log2 fold change (MAP): treatment treated vs control
Wald test p-value: treatment treated vs control
DataFrame with 35584 rows and 6 columns
                 baseMean log2FoldChange     lfcSE      stat       pvalue         padj
                <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
unigene22422654 82.11115       2.753315 0.2880473 9.558550 1.194176e-21 1.863751e-17
unigene22392883 110.42190       2.742819 0.3119236 8.793239 1.453092e-18 1.133920e-14
unigene22410436 94.91773       2.054046 0.2427170 8.462723 2.612059e-17 1.306140e-13
unigene22398806 133.11805       2.269261 0.2690689 8.433751 3.347576e-17 1.306140e-13
unigene22391050 159.42752       1.589496 0.1900555 8.363325 6.097477e-17 1.903266e-13
...                   ...            ...       ...       ...          ...          ...
unigene22403397         0             NA        NA        NA           NA           NA
unigene22417358         0             NA        NA        NA           NA           NA
unigene22397386         0             NA        NA        NA           NA           NA
unigene22415322         0             NA        NA        NA           NA           NA
unigene22402022         0             NA        NA        NA           NA           NA

> res1 <- results(dds, contrast=list("groupgt2.treatmenttreated", "groupgt2.treatmentcontrol"))

> res1
log2 fold change (MAP): groupgt2.treatmenttreated vs groupgt2.treatmentcontrol
Wald test p-value: groupgt2.treatmenttreated vs groupgt2.treatmentcontrol
DataFrame with 35584 rows and 6 columns
                 baseMean log2FoldChange     lfcSE       stat    pvalue      padj
                <numeric>      <numeric> <numeric> <numeric> <numeric> <numeric>
unigene22408676 23.641057     -0.2285822 0.5213050 -0.4384808 0.6610378 0.9999874
unigene22408675 1.032055     -0.6212150 1.2989812 -0.4782325 0.6324848 0.9999874
unigene22408674 11.503303     -0.1884673 0.5643655 -0.3339456 0.7384206 0.9999874
unigene22408673 1.849157      1.5855804 1.1773601 1.3467251 0.1780688 0.9999874
unigene22408672 2.717044     -1.2518670 1.0399068 -1.2038261 0.2286568 0.9999874
...                   ...            ...       ...        ...       ...       ...
unigene22403397         0             NA        NA         NA        NA        NA
unigene22417358         0             NA        NA         NA        NA        NA
unigene22397386         0             NA        NA         NA        NA        NA
unigene22415322         0             NA        NA         NA        NA        NA
unigene22402022         0             NA        NA         NA        NA        NA

deseq2 padj contrast results • 4.8k views

ADD COMMENT • link updated 8.7 years ago by Michael Love 41k • written 8.7 years ago by jjinhyoungkim ▴ 20

0

Entering edit mode

Ryan C. Thompson ★ 7.9k

@ryan-c-thompson-5618

Last seen 8 months ago

Scripps Research, La Jolla, CA

If all the adjusted p-values are 1, then that means the algorithm didn't find any significantly differentially expressed genes for that contrast.

ADD COMMENT • link 8.7 years ago Ryan C. Thompson ★ 7.9k

score 6 · Accepted Answer · 2015-08-18

Repeated values for the adjusted p-value is a direct consequence of the method of Benjamini and Hochberg. The paper is quite accessible and worth a look:

http://www.jstor.org/stable/2346101

See formula (1) under "False Discovery Rate Controlling Procedure".

All the p-values p_i up to p_k also get rejected at the FDR q.

For a visualization of this formula, take a few sorted p-values with adjusted p-value written above:

p <- sort(c(.01,.2,.21,.22,.5,.51,.52,.8,.9))
padj <- p.adjust(p, method="BH")
plot(p,ylim=c(0,1),xlim=c(0,length(p)))
text(seq_along(p),p,round(padj,3),pos=3)
abline(0,padj[4]/length(p))

To translate from the formula to the code above, k = 4, m=length(p), and q = padj[4].