Question: DESeq2: padj value changes with varying alpha
28 days ago, yoursbassanio wrote:

Hi,

I ran DESeq2 on the same set of genes and samples with varying alpha, as shown below.

Why does the adjusted p-value change for the same gene when alpha changes, even though the p-value remains the same?

I thought alpha is the FDR cutoff and that "NA" means the gene didn't clear the FDR cutoff. Is my interpretation correct?

dds <- DESeq(dds, parallel=TRUE )

alpha <- 0.05

Con2vsCon1 <- results(dds, contrast=c("Condition","Con2","Con1"), alpha=alpha)

write.table(Con2vsCon1, "Con2vsCon1_0.05.xls", sep="\t")

 

The result file:

baseMean     log2FoldChange  lfcSE        stat          pvalue       padj
8650.108665  1.016239114     0.144542498  7.030728877   2.05E-12     1.55E-08
423.1219264  0.935612819     0.13608645   6.875135773   6.19E-12     2.34E-08
36.5910756   1.247356344     0.189428319  6.58484618    4.55E-11     1.15E-07
364.305668   0.815937662     0.125767764  6.487653407   8.72E-11     1.35E-07
2463.544709  0.835643838     0.128866592  6.48456536    8.90E-11     1.35E-07
11.02946987  -0.000463788    0.242314322  -0.001913993  0.998472856  0.998737002

alpha <- 0.01

Con2vsCon1 <- results(dds, contrast=c("Condition","Con2","Con1"), alpha=alpha)

write.table(Con2vsCon1, "Con2vsCon1_0.01.xls", sep="\t")

 

baseMean     log2FoldChange  lfcSE        stat          pvalue       padj
8650.108665  1.016239114     0.144542498  7.030728877   2.05E-12     6.56E-09
423.1219264  0.935612819     0.13608645   6.875135773   6.19E-12     9.89E-09
36.5910756   1.247356344     0.189428319  6.58484618    4.55E-11     4.85E-08
364.305668   0.815937662     0.125767764  6.487653407   8.72E-11     5.68E-08
2463.544709  0.835643838     0.128866592  6.48456536    8.90E-11     5.68E-08
11.02946987  -0.000463788    0.242314322  -0.001913993  0.998472856  NA

 

Session info

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
[9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] pheatmap_1.0.7             DESeq2_1.12.4             
[3] SummarizedExperiment_1.4.0 Biobase_2.32.0            
[5] GenomicRanges_1.26.1       GenomeInfoDb_1.8.7        
[7] IRanges_2.8.0              S4Vectors_0.12.0          
[9] BiocGenerics_0.20.0       

loaded via a namespace (and not attached):
[1] Rcpp_0.12.8          RColorBrewer_1.1-2   plyr_1.8.4          
[4] XVector_0.12.1       tools_3.3.1          zlibbioc_1.18.0     
[7] rpart_4.1-10         RSQLite_1.0.0        annotate_1.50.0     
[10] tibble_1.2           gtable_0.2.0         lattice_0.20-34     
[13] Matrix_1.2-6         DBI_0.4-1            gridExtra_2.2.1     
[16] genefilter_1.56.0    cluster_2.0.5        locfit_1.5-9.1      
[19] grid_3.3.1           nnet_7.3-12          data.table_1.10.0   
[22] AnnotationDbi_1.36.0 XML_3.98-1.4         survival_2.39-4     
[25] BiocParallel_1.6.6   foreign_0.8-67       latticeExtra_0.6-28
[28] Formula_1.2-1        geneplotter_1.50.0   ggplot2_2.2.0       
[31] Hmisc_3.17-4         scales_0.4.1         splines_3.3.1       
[34] assertthat_0.1       colorspace_1.3-1     xtable_1.8-2        
[37] acepack_1.3-3.3      lazyeval_0.2.0       munsell_0.4.3

 

28 days ago, Michael Love (United States) wrote:

What changes when you change alpha is the independent filtering. See the explanation of the alpha argument in ?results:

   alpha: the significance cutoff used for optimizing the independent
          filtering (by default 0.1). If the adjusted p-value cutoff
          (FDR) will be a value other than 0.1, ‘alpha’ should be set
          to that value.

The recommendation is to set alpha to the value that will be used to threshold adjusted p-values, because that is what the independent filtering is trying to optimize (and likewise for IHW). For independent filtering, this chosen value affects the filter on mean of normalized counts for discarding genes with too little signal for testing. Removing more genes will lower the adjusted p-values for genes that survive the filter.
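
To make that last point concrete, here is a minimal base R sketch showing that the Benjamini-Hochberg adjustment depends on how many tests are included. The `pvalue` and `keep` vectors are toy values invented for illustration (not the data from the question); `p.adjust()` is from base R. DESeq2 reports padj as NA for genes removed by the filter, which the sketch mimics by hand.

```r
# Toy raw p-values for 8 genes; the last 4 stand in for low-count genes.
pvalue <- c(1e-6, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99)
keep   <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)

# No filtering: BH correction over all 8 tests.
padj_all <- p.adjust(pvalue, method = "BH")

# With filtering: BH correction over the 4 surviving tests only;
# filtered genes are assigned padj = NA.
padj_filtered <- rep(NA_real_, length(pvalue))
padj_filtered[keep] <- p.adjust(pvalue[keep], method = "BH")

padj_all[1]       # 8e-06: the raw p-value times 8 tests
padj_filtered[1]  # 4e-06: the same raw p-value times 4 tests
padj_filtered[8]  # NA: removed by the filter, not "failed the FDR cutoff"
```

The same arithmetic can be applied to the tables in the question. Assuming the first row holds the smallest p-value, padj / pvalue estimates the number of genes tested: 1.55E-08 / 2.05E-12 is about 7,560 at alpha = 0.05, and 6.56E-09 / 2.05E-12 is about 3,200 at alpha = 0.01, within the rounding of the reported values.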


Hi Mike,

Thank you for the reply.

Sorry, I didn't understand your answer completely. If I change my FDR from 5% (0.05) to 1% (0.01), how does that affect the adjusted p-value (q-value) of a gene that passed the cutoff on both occasions?

As shown above, the p-value for the same gene was 2.05E-12 on both occasions, but the q-value changes.

Doesn't the adjusted p-value remain the same, and only change to NA if it doesn't satisfy the FDR cutoff?

Reply written 28 days ago by yoursbassanio

The 'alpha' that you specify to results() is passed along to a function that applies a procedure called independent filtering (IF), or a newer method called IHW if you specify that method when running results(). The help page for results() cites a reference for IF if you want to read more, and it is also discussed in the DESeq2 paper in a section on independent filtering. In the IF case, the procedure finds a threshold on the mean of normalized counts, below which the signal is too low for sufficient power in statistical testing. Changing alpha can move that threshold, which changes how many genes pass the filter, and changing the number of tests changes the adjusted p-values.
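
The threshold search can be sketched in a few lines of base R. This is a deliberately simplified version that picks the lowest cutoff giving the most rejections at `alpha`; DESeq2's own procedure differs in detail. The function names `count_rejections` and `choose_cutoff`, the toy vectors, and the candidate cutoffs are all invented for illustration.

```r
# Number of genes with BH-adjusted p-value below alpha, after dropping
# genes whose mean normalized count is below the cutoff.
count_rejections <- function(pvalue, base_mean, cutoff, alpha) {
  keep <- base_mean >= cutoff
  padj <- p.adjust(pvalue[keep], method = "BH")
  sum(padj < alpha)
}

# Try each candidate cutoff (assumed sorted ascending) and return the
# lowest one that maximizes the number of rejections.
choose_cutoff <- function(pvalue, base_mean, alpha, cutoffs) {
  n_rej <- vapply(
    cutoffs,
    function(cutoff) count_rejections(pvalue, base_mean, cutoff, alpha),
    integer(1)
  )
  cutoffs[which.max(n_rej)]
}

# Toy data: 4 well-expressed genes with small p-values, 6 low-count genes.
base_mean <- c(1000, 500, 200, 100, 5, 4, 3, 2, 1, 1)
pvalue    <- c(0.001, 0.004, 0.012, 0.03, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99)
cutoffs   <- c(0, 10)

choose_cutoff(pvalue, base_mean, alpha = 0.1,  cutoffs = cutoffs)  # 0
choose_cutoff(pvalue, base_mean, alpha = 0.05, cutoffs = cutoffs)  # 10
```

In this toy case the stricter alpha moves the cutoff from 0 to 10, so the six low-count genes are filtered (their padj would be NA in DESeq2) and the four surviving genes are adjusted over fewer tests. That is the same pattern as in the question, where the low-count gene's padj becomes NA at alpha = 0.01.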

Reply written 28 days ago by Michael Love