DESeq2: padj value changes with varying alpha
@yoursbassanio-12717

Hi,

I ran DESeq2 on the same set of genes and samples with varying alpha, as shown below.

Why does the adjusted p-value change for the same gene when alpha changes, even though the p-value remains the same?

I thought alpha was the FDR cutoff and that "NA" meant the gene did not clear the FDR cutoff. Is my interpretation correct?

dds <- DESeq(dds, parallel=TRUE)

alpha <- 0.05

Con2vsCon1 <- results(dds, contrast=c("Condition","Con2","Con1"), alpha=alpha)

write.table(Con2vsCon1, "Con2vsCon1_0.05.xls", sep="\t")

 

The result file:

baseMean log2FoldChange lfcSE stat pvalue padj
8650.108665 1.016239114 0.144542498 7.030728877 2.05E-12 1.55E-08
423.1219264 0.935612819 0.13608645 6.875135773 6.19E-12 2.34E-08
36.5910756 1.247356344 0.189428319 6.58484618 4.55E-11 1.15E-07
364.305668 0.815937662 0.125767764 6.487653407 8.72E-11 1.35E-07
2463.544709 0.835643838 0.128866592 6.48456536 8.90E-11 1.35E-07
11.02946987 -0.000463788 0.242314322 -0.001913993 0.998472856 0.998737002
alpha <- 0.01

Con2vsCon1 <- results(dds, contrast=c("Condition","Con2","Con1"), alpha=alpha)

write.table(Con2vsCon1, "Con2vsCon1_0.01.xls", sep="\t")

 

baseMean log2FoldChange lfcSE stat pvalue padj
8650.108665 1.016239114 0.144542498 7.030728877 2.05E-12 6.56E-09
423.1219264 0.935612819 0.13608645 6.875135773 6.19E-12 9.89E-09
36.5910756 1.247356344 0.189428319 6.58484618 4.55E-11 4.85E-08
364.305668 0.815937662 0.125767764 6.487653407 8.72E-11 5.68E-08
2463.544709 0.835643838 0.128866592 6.48456536 8.90E-11 5.68E-08
11.02946987 -0.000463788 0.242314322 -0.001913993 0.998472856 NA

 

Session info

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
[9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] pheatmap_1.0.7             DESeq2_1.12.4             
[3] SummarizedExperiment_1.4.0 Biobase_2.32.0            
[5] GenomicRanges_1.26.1       GenomeInfoDb_1.8.7        
[7] IRanges_2.8.0              S4Vectors_0.12.0          
[9] BiocGenerics_0.20.0       

loaded via a namespace (and not attached):
[1] Rcpp_0.12.8          RColorBrewer_1.1-2   plyr_1.8.4          
[4] XVector_0.12.1       tools_3.3.1          zlibbioc_1.18.0     
[7] rpart_4.1-10         RSQLite_1.0.0        annotate_1.50.0     
[10] tibble_1.2           gtable_0.2.0         lattice_0.20-34     
[13] Matrix_1.2-6         DBI_0.4-1            gridExtra_2.2.1     
[16] genefilter_1.56.0    cluster_2.0.5        locfit_1.5-9.1      
[19] grid_3.3.1           nnet_7.3-12          data.table_1.10.0   
[22] AnnotationDbi_1.36.0 XML_3.98-1.4         survival_2.39-4     
[25] BiocParallel_1.6.6   foreign_0.8-67       latticeExtra_0.6-28
[28] Formula_1.2-1        geneplotter_1.50.0   ggplot2_2.2.0       
[31] Hmisc_3.17-4         scales_0.4.1         splines_3.3.1       
[34] assertthat_0.1       colorspace_1.3-1     xtable_1.8-2        
[37] acepack_1.3-3.3      lazyeval_0.2.0       munsell_0.4.3

 

deseq2 counts rnaseq • 4.3k views
@mikelove

What changes when you change alpha is the independent filtering. If you look in ?results for the explanation of the argument alpha:

   alpha: the significance cutoff used for optimizing the independent
          filtering (by default 0.1). If the adjusted p-value cutoff
          (FDR) will be a value other than 0.1, ‘alpha’ should be set
          to that value.

The recommendation is to set alpha to the value that will be used to threshold adjusted p-values, because that is what the independent filtering is trying to optimize (and likewise for IHW). For independent filtering, this chosen value affects the filter on mean of normalized counts for discarding genes with too little signal for testing. Removing more genes will lower the adjusted p-values for genes that survive the filter.
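To make that last point concrete, here is a minimal base R sketch. It is not DESeq2's actual filtering code: the p-values and the `keep` filter are made up for illustration, and the only thing borrowed from DESeq2 is that its default adjustment method is Benjamini-Hochberg ("BH"). The raw p-values never change, but the adjusted ones shrink once fewer tests enter the correction, and the genes that were filtered out get NA.

```r
# Toy p-values: three strong signals plus many uninformative tests.
pvals <- c(2.05e-12, 6.19e-12, 4.55e-11, rep(0.5, 997))

# BH adjustment over all 1000 tests.
padj_all <- p.adjust(pvals, method = "BH")

# Hypothetical filter: keep the three strong genes and every other weak one.
# In DESeq2 the filter is on the mean of normalized counts instead.
keep <- c(rep(TRUE, 3), rep(c(TRUE, FALSE), length.out = 997))

# BH adjustment over the surviving tests only; filtered genes stay NA.
padj_filtered <- rep(NA_real_, length(pvals))
padj_filtered[keep] <- p.adjust(pvals[keep], method = "BH")

# Same p-value, different adjusted p-value for the first gene.
padj_all[1]
padj_filtered[1]

# Number of genes with NA in the adjusted column.
sum(is.na(padj_filtered))
```

So an NA in padj here means "removed before multiple-testing correction", not "failed the FDR cutoff".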


Hi Mike,

Thank you for the reply.

Sorry, I didn't completely understand your answer. If I change my FDR from 5% (0.05) to 1% (0.01), how does that affect the adjusted p-value (q-value) of a gene that cleared the cutoff on both occasions?

As shown above, the p-value for the same gene was 2.05E-12 on both occasions, but the q-value changes.

Shouldn't the adjusted p-value remain the same, and only change to NA if it doesn't satisfy the FDR cutoff?


The 'alpha' that you specify to results() is passed along to a function that applies a procedure called independent filtering (IF), or also a new method called IHW if you specify that method when running results(). We have a citation for IF in the help page for the results() function, if you want to read more, and it's also discussed in the DESeq2 paper under a section on independent filtering. In the IF case, the procedure finds a threshold for mean of normalized counts, below which the signal is too low for sufficient power in statistical testing. Changing the number of tests (how many pass the filter) changes the adjusted p-value.
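A rough way to check this on your own object is sketched below. It uses simulated data from makeExampleDESeqDataSet() rather than the data in this thread, so the factor is named condition with levels A and B (those names come from the simulator, not from the original post). It also assumes a DESeq2 version recent enough that results() stores its filtering information in metadata(); older releases may differ.

```r
library(DESeq2)

# Simulated counts with some true differences between groups A and B.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 6, betaSD = 1)
dds <- DESeq(dds)

# Same fit, two different alpha values for the independent filtering.
res05 <- results(dds, contrast = c("condition", "B", "A"), alpha = 0.05)
res01 <- results(dds, contrast = c("condition", "B", "A"), alpha = 0.01)

# The raw p-values do not depend on alpha.
identical(res05$pvalue, res01$pvalue)

# The chosen threshold on mean normalized counts may differ between the two.
metadata(res05)$filterThreshold
metadata(res01)$filterThreshold

# Number of genes that entered the multiple-testing correction in each case.
sum(!is.na(res05$padj))
sum(!is.na(res01$padj))

# With independent filtering switched off, padj no longer depends on alpha.
res_nofilter <- results(dds, contrast = c("condition", "B", "A"),
                        independentFiltering = FALSE)
```

If the two thresholds differ, a different number of genes is passed to the adjustment, which is why padj moves while pvalue stays fixed.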
