Question: Fwd: Conservative results using DEXSeq
0
5.8 years ago by
Levi Waldron920
CUNY Graduate School of Public Health and Health Policy, New York, NY
Levi Waldron920 wrote:
I have noticed the kind of p-value histograms that Gu describes in other situations as well, even when the technologies and bioinformatic methods are the same as in situations where it doesn't occur. I am not sure why it happens, but could it have to do with a batch effect that is *not* confounded with the outcome variable?

As an example I'm attaching raw p-value histograms of Cox regressions for each of 14 ovarian cancer datasets, code below. At least one of these has the monotonic increase described. That experiment used the same microarray platform as many of the other datasets (Affy hgu133plus2), but it is the only experiment using microdissected tissues. The point is just that the effect could be magnified for some reason relating to the experiment.

    library(survival)
    library(affy)
    library(curatedOvarianData)
    # Install survHD 0.99.1 if it is missing or a different version is installed
    if( !require("survHD") || packageVersion("survHD") != "0.99.1" ){
        library(devtools)
        install_url("https://bitbucket.org/lwaldron/survhd/downloads/survHD_0.99.1.tar.gz")
    }
    # Make sure survHD (which provides rowCoxTests) is loaded after installation
    library(survHD)

    # Build the list of ExpressionSets ("esets") shipped with curatedOvarianData
    source(system.file("extdata", "patientselection.config", package = "curatedOvarianData"))
    source(system.file("extdata", "createEsetList.R", package = "curatedOvarianData"))

    # Row-wise Cox regression per dataset; column 3 is used as the raw p-value
    pvals <- lapply(esets, function(eset) rowCoxTests(exprs(eset), eset$y)[, 3])

    # One histogram of raw p-values per dataset
    png("Cox_p-values.png")
    par(mfrow = c(4, 4))
    for (i in seq_along(pvals))
        hist(pvals[[i]], main = names(pvals)[i], xlab = "raw p-value")
    dev.off()

On Wed, Jul 24, 2013 at 3:55 AM, Simon Anders <anders at embl.de> wrote:

> Hi
>
> On 23/07/13 14:47, Gu [guest] wrote:
>
>> By checking the histogram of raw p-values of exons (NOT genes), I
>> find that it is monotonically increasing from 0 to 1, with relatively
>> few counting bins falling into the bins from 0 to 0.2.
>
> You are right, DEXSeq sometimes tends to be overly conservative, which
> then results in a skewed p value histogram as you describe it. Usually, it
> is, however, only a rather slight skew, and it seems that the performance
> is unusually bad for your specific dataset.
>
> The main reason for the conservative results is the way we estimate
> dispersion. Since the release of DEXSeq, we have made quite some progress
> in improving the dispersion estimation by now using an empirical-Bayes
> shrinkage estimator, and DESeq2 now offers a much better solution, at least
> for gene-level tests. We are working on applying the same changes to
> DEXSeq, and this should solve your issue. I'm afraid, however, that I have
> to ask you for some patience until we are finished with these changes.
>
> Simon

Attachment: Cox_p-values.png (image/png, 13580 bytes)
URL: https://stat.ethz.ch/pipermail/bioconductor/attachments/20130724/b36b28fd/attachment.png

cancer ovarian dexseq deseq2 • 628 views
modified 5.8 years ago by Wolfgang Huber13k • written 5.8 years ago by Levi Waldron920

Answer: Fwd: Conservative results using DEXSeq
0
5.8 years ago by
EMBL European Molecular Biology Laboratory
Wolfgang Huber13k wrote:
Dear Levi

thanks, you are right: batch effects can inflate within-group variation relative to between-group variation, and thus lead to p-value distributions that are more concentrated towards 1 than uniform. Such an effect could play a role in addition to the one that Simon described. In Gu's case, further diagnostics are needed to disentangle and potentially fix the problem.

Best wishes
Wolfgang
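
The batch-effect explanation discussed in this thread can be illustrated with a small simulation. This is only a sketch under simplifying assumptions and is not code from any of the posters: it uses normally distributed data and a two-sample t-test rather than Cox regression or DEXSeq, and all names (`expr_null`, `expr_batch`, `p_group_only`, `p_adjusted`) are made up for illustration. There is no true group effect. The batch is balanced across groups, so it is not confounded with the outcome, yet ignoring it inflates the within-group variance estimate and pushes p-values towards 1.

```r
set.seed(42)

n_genes <- 2000
n_per_group <- 10

# Two groups; batches alternate within each group, so batch is balanced
# across groups and therefore not confounded with the outcome.
group <- factor(rep(c("A", "B"), each = n_per_group))
batch <- factor(rep(c("b1", "b2"), times = n_per_group))

# Null data: no group effect at all. Each gene gets its own batch shift
# (assumed standard deviation of 3, chosen only to make the skew visible).
batch_shift <- rnorm(n_genes, mean = 0, sd = 3)
noise <- matrix(rnorm(n_genes * 2 * n_per_group), nrow = n_genes)
expr_null <- noise
expr_batch <- noise + outer(batch_shift, as.numeric(batch == "b2"))

# p-value for the group effect, ignoring batch (pooled-variance t-test).
p_group_only <- function(y) t.test(y ~ group, var.equal = TRUE)$p.value

# p-value for the group effect after adjusting for batch in a linear model.
p_adjusted <- function(y) {
  summary(lm(y ~ batch + group))$coefficients["groupB", "Pr(>|t|)"]
}

pvals <- list(
  "no batch effect" = apply(expr_null, 1, p_group_only),
  "batch ignored"   = apply(expr_batch, 1, p_group_only),
  "batch in model"  = apply(expr_batch, 1, p_adjusted)
)

# One histogram of raw p-values per scenario.
par(mfrow = c(1, 3))
for (i in seq_along(pvals))
  hist(pvals[[i]], main = names(pvals)[i], xlab = "raw p-value")

# Fraction of p-values below 0.05; about 0.05 is expected for a
# well-calibrated test when there is no true effect.
sapply(pvals, function(p) mean(p < 0.05))
```

In this toy setting the "batch ignored" histogram rises towards 1, while including the batch as a covariate brings the distribution back to roughly uniform. Whether something similar explains a particular real dataset would still need the kind of diagnostics mentioned above.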