Dear Community,
I'm using DESeq2 to analyze an RNAseq experiement in which I'm evaluating the transcriptional variations mediated by the over expression of some transcription factors.
The experimental design is as follows: 5 conditions (ctrl; x; y; z; x+y+z), and each condition ha 5 replicates.
In a previous question (see the link: https://support.bioconductor.org/p/80178/) I've asked whether the use of the "contrast" argument was appropriate to examine the group-wise differences in expression, and apparently it is correct tool.
However, inspecting the p value histograms (same thing for the pAdj ones) all the comparisons give rise to "U" shaped distributions, with a strong bias towards the "1" limit of the distribution (from 1 to 10 times higher than of the "0" peak).
Here you have the code I use:
condition<-c("x","x","x","x","x","y","y","y","y","y","z","z","z","z","z","xyz","xyz","xyz","xyz","xyz","ctrl","ctrl","ctrl","ctrl","ctrl") #declare my conditions ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory,design= ~ condition) #building the dds object ddsHTSeqfilt <- ddsHTSeq[ rowSums(counts(ddsHTSeq)) > 1, ] #filtering 0 values ddsHTSeqfilt$condition <- relevel(ddsHTSeqfilt$condition, ref="ctrl") # specifying to use ctrl as reference library("BiocParallel") register(SnowParam(2)) dds <- DESeq(ddsHTSeqfilt) DEx<-results(dds, contrast=c("condition","x","ctrl"), alpha=0.05, lfcThreshold = 1, altHypothesis = "greaterAbs", independentFiltering = T) DEy<-results(dds, contrast=c("condition","y","ctrl"), alpha=0.05, lfcThreshold = 1, altHypothesis = "greaterAbs", independentFiltering = T) DEz<-results(dds, contrast=c("condition","z","ctrl"), alpha=0.05, lfcThreshold = 1, altHypothesis = "greaterAbs", independentFiltering = T) DExyz<-results(dds, contrast=c("condition","xyz","ctrl"), alpha=0.05, lfcThreshold = 1, altHypothesis = "greaterAbs", independentFiltering = T) sessionInfo() R version 3.2.4 Revised (2016-03-16 r70336) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8 x64 (build 9200) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] parallel stats4 grid stats graphics grDevices utils datasets methods [10] base other attached packages: [1] fdrtool_1.2.15 BiocInstaller_1.20.3 org.Mm.eg.db_3.2.3 [4] RSQLite_1.0.0 DBI_0.3.1 AnnotationDbi_1.32.3 [7] genefilter_1.52.1 PoiClaClu_1.0.2 RColorBrewer_1.1-2 [10] reshape2_1.4.1 DESeq2_1.10.1 RcppArmadillo_0.6.700.3.0 [13] Rcpp_0.12.4 SummarizedExperiment_1.0.2 Biobase_2.30.0 [16] GenomicRanges_1.22.4 GenomeInfoDb_1.6.3 IRanges_2.4.8 [19] S4Vectors_0.8.11 BiocGenerics_0.16.1 ggplot2_2.1.0 [22] pheatmap_1.0.8 rmarkdown_0.9.5 circlize_0.3.5 [25] ComplexHeatmap_1.6.0 MASS_7.3-45 ReporteRs_0.8.6 [28] ReporteRsjars_0.0.2 BiocParallel_1.4.3 FactoMineR_1.32 [31] rJava_0.9-8
As far as I understood, reading the nice tutorial from Bernd Klaus and Wolfgang Huber (link: http://www-huber.embl.de/users/klaus/Teaching/DESeq2Predoc2014.html#inspection-and-correction-of-pvalues), having a "U" skewed distribution of p values represents an underestimation of the gene-wise variance, and theoretically it should be possible to correct it using the "fdrtool" package. However this approach didn't work so far. Here you can see the code:
DEx <- DEx[ !is.na(DEx$padj), ] DEx <- DEx[ !is.na(DEx$pvalue), ] DEx <- DEx[, -which(names(DEx) == "padj")] FDR.DEx <- fdrtool(DEx$stat, statistic= "normal", plot = T)
which ends up with the following error message:
Step 1... determine cutoff point Step 2... estimate parameters of null distribution and eta0 Error in optimize(nlogL, lower = lo, upper = up) : 'xmin' not less than 'xmax'
I've inspected the data using PCA and several heat maps (for different genes sub-groups) and the samples have some obvious differences. However, I'm afraid to that having an abnormal "U" shaped p values distribution might lead to some misleading conclusion.
I'd really appreciate the help of the community here.
Thanks in advance for any help.
All the best.
Seba