Hi,
I ran DESeq2 on the same set of genes and samples with varying alpha, as shown below.
Why does the adjusted p-value change for the same gene when alpha changes, even though the p-value stays the same?
I thought alpha is the FDR cutoff and that "NA" means the gene didn't clear the FDR cutoff. Is my interpretation correct?
dds <- DESeq(dds, parallel=TRUE )
alpha <- 0.05
Con2vsCon1 <- results(dds, contrast=c("Condition","Con2","Con1"), alpha=alpha)
write.table(Con2vsCon1, "Con2vsCon1_0.05.xls", sep="\t")
The result file:
baseMean | log2FoldChange | lfcSE | stat | pvalue | padj |
8650.108665 | 1.016239114 | 0.144542498 | 7.030728877 | 2.05E-12 | 1.55E-08 |
423.1219264 | 0.935612819 | 0.13608645 | 6.875135773 | 6.19E-12 | 2.34E-08 |
36.5910756 | 1.247356344 | 0.189428319 | 6.58484618 | 4.55E-11 | 1.15E-07 |
364.305668 | 0.815937662 | 0.125767764 | 6.487653407 | 8.72E-11 | 1.35E-07 |
2463.544709 | 0.835643838 | 0.128866592 | 6.48456536 | 8.90E-11 | 1.35E-07 |
11.02946987 | -0.000463788 | 0.242314322 | -0.001913993 | 0.998472856 | 0.998737002 |
alpha <- 0.01
Con2vsCon1 <- results(dds, contrast=c("Condition","Con2","Con1"), alpha=alpha)
write.table(Con2vsCon1, "Con2vsCon1_0.01.xls", sep="\t")
baseMean | log2FoldChange | lfcSE | stat | pvalue | padj |
8650.108665 | 1.016239114 | 0.144542498 | 7.030728877 | 2.05E-12 | 6.56E-09 |
423.1219264 | 0.935612819 | 0.13608645 | 6.875135773 | 6.19E-12 | 9.89E-09 |
36.5910756 | 1.247356344 | 0.189428319 | 6.58484618 | 4.55E-11 | 4.85E-08 |
364.305668 | 0.815937662 | 0.125767764 | 6.487653407 | 8.72E-11 | 5.68E-08 |
2463.544709 | 0.835643838 | 0.128866592 | 6.48456536 | 8.90E-11 | 5.68E-08 |
11.02946987 | -0.000463788 | 0.242314322 | -0.001913993 | 0.998472856 | NA |
Session info
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] pheatmap_1.0.7 DESeq2_1.12.4
[3] SummarizedExperiment_1.4.0 Biobase_2.32.0
[5] GenomicRanges_1.26.1 GenomeInfoDb_1.8.7
[7] IRanges_2.8.0 S4Vectors_0.12.0
[9] BiocGenerics_0.20.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.8 RColorBrewer_1.1-2 plyr_1.8.4
[4] XVector_0.12.1 tools_3.3.1 zlibbioc_1.18.0
[7] rpart_4.1-10 RSQLite_1.0.0 annotate_1.50.0
[10] tibble_1.2 gtable_0.2.0 lattice_0.20-34
[13] Matrix_1.2-6 DBI_0.4-1 gridExtra_2.2.1
[16] genefilter_1.56.0 cluster_2.0.5 locfit_1.5-9.1
[19] grid_3.3.1 nnet_7.3-12 data.table_1.10.0
[22] AnnotationDbi_1.36.0 XML_3.98-1.4 survival_2.39-4
[25] BiocParallel_1.6.6 foreign_0.8-67 latticeExtra_0.6-28
[28] Formula_1.2-1 geneplotter_1.50.0 ggplot2_2.2.0
[31] Hmisc_3.17-4 scales_0.4.1 splines_3.3.1
[34] assertthat_0.1 colorspace_1.3-1 xtable_1.8-2
[37] acepack_1.3-3.3 lazyeval_0.2.0 munsell_0.4.3
Hi Mike,
Thank you for the reply.
Sorry, I didn't completely understand your answer. If I change my FDR from 5% (0.05) to 1% (0.01), how does that affect the adjusted p-value (q-value) of a gene that cleared the cutoff on both occasions?
As shown above, the p-value for the same gene was 2.05E-12 on both occasions, but the q-value changes.
Shouldn't the adjusted p-value remain the same, and only change to NA if it doesn't satisfy the FDR cutoff?
The 'alpha' that you specify to results() is passed along to a function that applies a procedure called independent filtering (IF), or a newer method called IHW if you specify that method when running results(). We have a citation for IF in the help page for the results() function, if you want to read more, and it's also discussed in the DESeq2 paper under a section on independent filtering. In the IF case, the procedure finds a threshold for the mean of normalized counts, below which the signal is too low for sufficient power in statistical testing. The threshold depends on alpha, so a different alpha means a different number of genes pass the filter. Changing the number of tests (how many pass the filter) changes the adjusted p-value, even though the raw p-value is untouched. Genes below the threshold are left out of the adjustment and get padj set to NA, so NA means "filtered out for low counts", not "failed the FDR cutoff".
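To make the last point concrete, here is a small base R sketch of the Benjamini-Hochberg adjustment (the default pAdjustMethod in results()) applied to the five smallest p-values from the tables in the question. The two test counts, 7560 and 3200, are assumptions: they are rough values back-calculated from the rounded padj column (padj divided by pvalue for the top-ranked gene), not numbers reported in the thread, and the sketch also assumes these five genes are the five smallest p-values in both runs.

```r
# Five smallest raw p-values, copied from the tables in the question.
pvalue <- c(2.05e-12, 6.19e-12, 4.55e-11, 8.72e-11, 8.90e-11)

# ASSUMPTION: approximate number of genes passing independent filtering in
# each run, back-calculated from the rounded padj values (hypothetical).
n_alpha05 <- 7560  # run with alpha = 0.05
n_alpha01 <- 3200  # run with alpha = 0.01

# Benjamini-Hochberg adjustment; 'n' is the total number of tests performed,
# which may be larger than the number of p-values passed in.
padj_alpha05 <- p.adjust(pvalue, method = "BH", n = n_alpha05)
padj_alpha01 <- p.adjust(pvalue, method = "BH", n = n_alpha01)

# Same raw p-values, different adjusted p-values: only the test count differs.
print(data.frame(pvalue, padj_alpha05, padj_alpha01))
```

The raw p-values are identical in both calls; only the number of tests differs, and fewer tests give smaller adjusted p-values. If you want padj to stop depending on alpha, results() has an independentFiltering argument that can be set to FALSE, at the cost of the power that the filtering is meant to gain.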