p-values histogram in case of no effect
@itamarkanter-7736

Hi all,

I have a simple experiment of treated vs. untreated cells with two biological replicates.
When I contrast the biological replicates, I find many differentially expressed genes, and the p-value histogram is nicely flat except for a peak at 0.
However, when I contrast treated vs. untreated, the p-value histogram has its maximum at 1 and is no longer flat (as would be expected by chance).
In the PCA plot, 100% of the variance (and 97% when I set ntop=Inf) corresponds to the axis related to the difference between the biological replicates.
I'm wondering: even if the treatment has no effect on the cells, shouldn't the p-value histogram be flat?

Thanks,
Itamar


ddsATRT <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable[c(-1, -2), ],
                                      directory = directory,
                                      design = ~ PS + treatment)  # PS stands for biological replicate ("2" and "17")
ddsATRT$treatment <- relevel(ddsATRT$treatment, "UT")
ddsATRT <- DESeq(ddsATRT)
resATRT <- results(ddsATRT, contrast = c("treatment", "BAPN", "UT"))  # BAPN vs. UT
resATRT_PS <- results(ddsATRT, contrast = c("PS", "2", "17"))  # 2 vs. 17 (numerator level is listed first)

> as.data.frame(colData(ddsATRT))
              treatment PS sizeFactor
ATRT_A2_UT           UT  2  0.9171904
ATRT_A2_BAPN       BAPN  2  0.9629325
ATRT_A17_UT          UT 17  0.9984706
ATRT_A17_BAPN      BAPN 17  1.1456955

> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DESeq2_1.8.1              RcppArmadillo_0.5.100.1.0 Rcpp_0.11.6               GenomicRanges_1.20.3      GenomeInfoDb_1.4.0       
[6] IRanges_2.2.1             S4Vectors_0.6.0           BiocGenerics_0.14.0       BiocInstaller_1.18.1     

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.2           XVector_0.8.0        futile.options_1.0.0 tools_3.2.0         
 [7] rpart_4.1-9          digest_0.6.8         RSQLite_1.0.0        annotate_1.46.0      gtable_0.1.2         lattice_0.20-31     
[13] DBI_0.3.1            proto_0.3-10         gridExtra_0.9.1      genefilter_1.50.0    stringr_1.0.0        cluster_2.0.1       
[19] locfit_1.5-9.1       nnet_7.3-9           grid_3.2.0           Biobase_2.28.0       AnnotationDbi_1.30.1 XML_3.98-1.1        
[25] survival_2.38-1      BiocParallel_1.2.1   foreign_0.8-63       latticeExtra_0.6-26  Formula_1.2-1        geneplotter_1.46.0  
[31] ggplot2_1.0.1        reshape2_1.4.1       lambda.r_1.1.7       magrittr_1.5         scales_0.2.4         Hmisc_3.16-0        
[37] MASS_7.3-39          splines_3.2.0        xtable_1.7-4         colorspace_1.2-6     labeling_0.3         stringi_0.4-1       
[43] acepack_1.3-3.3      munsell_0.4.2   

@mikelove

The p-value histogram should typically be flat under the null hypothesis (provided you filter out the very low-count genes, which produce discrete spikes).

"In the PCA plot, 100% of the variance (and 97% when I set ntop=Inf) corresponds to the axis related to the difference between the biological replicates."

So it sounds like the effect of PS is not null? Then the p-value histogram is not expected to be flat.
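
To illustrate the point above about low-count genes, here is a minimal base-R sketch on simulated data (not the data from this thread). It compares null p-values from a continuous test with null p-values from an exact test on very small counts. The exact binomial test is only a stand-in chosen to make the discreteness visible; it is not the test DESeq2 uses, and all variable names are hypothetical.

```r
# Simulated illustration only: every "gene" here is truly null.
set.seed(1)
n_genes <- 5000
breaks <- seq(0, 1, by = 0.05)

# Continuous case: two-sample t-test, both groups drawn from the same normal.
p_cont <- replicate(n_genes,
                    t.test(rnorm(5), rnorm(5), var.equal = TRUE)$p.value)

# Low-count case: two Poisson counts with the same small mean, compared with
# an exact binomial test conditional on their total (a stand-in test).
x <- rpois(n_genes, lambda = 1)
y <- rpois(n_genes, lambda = 1)
p_disc <- mapply(function(a, b) {
  if (a + b == 0) return(1)  # no reads at all: no evidence either way
  binom.test(a, a + b, p = 0.5)$p.value
}, x, y)

# Bin both sets of p-values; drop plot = FALSE to draw the histograms.
h_cont <- hist(p_cont, breaks = breaks, plot = FALSE)
h_disc <- hist(p_disc, breaks = breaks, plot = FALSE)

# Fraction of genes per bin: roughly 0.05 everywhere in the continuous case,
# a few occupied bins with a large pile-up near 1 in the low-count case.
print(round(h_cont$counts / n_genes, 3))
print(round(h_disc$counts / n_genes, 3))
```

With real DESeq2 results, the analogous check would be to plot the histogram only for genes above some mean-count cutoff (for example using the `baseMean` column of the results table) and see whether the spikes disappear.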

@itamarkanter-7736

The PS factor corresponds to the biological replicates, which look quite different in both the DE analysis and the PCA plot.

But the treatment factor, which appears to have almost no effect on the samples (based on the PCA), gives a strange p-value distribution in which most genes concentrate around p-value = 1. Even if the null is true and the treatment has no effect on the samples, the distribution should be flat rather than concentrated around 1.

Is there any way to attach figures in this forum? I would like to show the two p-value histograms.
