
Question: QQ plot 450k data
Hi All,

What would be the right distribution to use for the expected p-values in a QQ plot of results from a 450k analysis? I have been searching, but have not been able to find it mentioned anywhere.

Thanks in advance,
Khadeeja
modified 4.8 years ago by James W. MacDonald • written 4.8 years ago by khadeeja ismail
Answer: QQ plot 450k data
James W. MacDonald wrote:
Hi Khadeeja,

The distribution of p-values under the null hypothesis should always be uniform.

Best,
Jim

James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
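A quick simulation can make this point concrete. The sketch below is not from the thread: it assumes a two-sample t-test per probe on pure-noise data, and the names `n_probes`, `n_per_group`, and `null_p` are hypothetical. It only illustrates that p-values from tests where the null is true look roughly uniform on [0, 1].

```r
set.seed(1)

n_probes    <- 5000   # hypothetical number of probes
n_per_group <- 10     # hypothetical number of samples per group

# Pure-noise data: both groups are drawn from the same distribution,
# so the null hypothesis is true for every simulated probe.
null_p <- replicate(n_probes, {
  x <- rnorm(n_per_group)
  y <- rnorm(n_per_group)
  t.test(x, y, var.equal = TRUE)$p.value
})

# Under the null, roughly 5% of p-values should fall below 0.05,
# and the histogram should be approximately flat.
mean(null_p < 0.05)
hist(null_p, breaks = 20, main = "p-values under the null", xlab = "p-value")
```

Real 450k data are unlikely to satisfy these assumptions exactly (probes are correlated, for instance), so this should be read as an idealised case.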
Hi Jim,

If (groups of) samples differ in their global methylation profile (i.e. the majority of probes differ significantly in one direction), would you not expect to see a deviation from this uniform distribution?

Best,
Martin

M.A. (Martin) Rijlaarsdam MSc. MD
Erasmus MC - University Medical Center Rotterdam
Department of Pathology
Hi all,

I think this depends on what question Khadeeja wants to answer. Traditionally, a QQ plot of the p-values from many statistical tests is used to assess the amount of signal in the data, i.e. an excess of low p-values (which could be due either to true biological signal or to bias). I assume this is what she wants to answer. The p-values from a genomic study can be compared to the uniform distribution using the following code:

    library("qqman")
    qq(pvals)

where pvals is a vector of the p-values. If there is a strong signal in the data, we do expect a deviation from the uniform distribution. I am not aware of a potential use for knowing the exact distribution of these p-values.

John
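For readers without qqman installed, the comparison against the uniform distribution can be sketched in base R. This is an illustration only, not a description of how `qq()` is implemented: `pvals` is filled with placeholder random values here and should be replaced with the real vector, and the choice of `ppoints()` for the expected quantiles is an assumption.

```r
set.seed(1)
pvals <- runif(1000)   # placeholder p-values; replace with your own vector

# Expected quantiles of Uniform(0, 1) for n ordered p-values.
n        <- length(pvals)
expected <- ppoints(n)   # equals (i - 0.5) / n when n > 10
observed <- sort(pvals)

# Plot on the -log10 scale so that the smallest p-values stand out.
plot(-log10(expected), -log10(observed),
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1, col = "red")   # points on this line agree with the uniform
```

Points that rise above the red line at the right-hand end correspond to p-values smaller than the uniform distribution would predict.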
Hi Martin,

That's exactly what you would expect to see, which is why you do a QQ plot in the first place. On one axis you plot the expected (uniform) distribution of p-values, and on the other axis you plot the observed p-values. The p-values that deviate from the expectation are then possible true positives.

Best,
Jim
written 4.8 years ago by James W. MacDonald
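The deviation discussed in the exchange above can be illustrated with simulated data. The sketch below is not from the thread and rests on stated assumptions: z-statistics rather than a real 450k analysis, a hypothetical 10% of probes carrying a true difference, and an arbitrary shift of 3 for those probes.

```r
set.seed(1)

n_null   <- 9000   # hypothetical probes with no group difference
n_signal <- 1000   # hypothetical probes with a true difference

# z-statistics: null probes are centred at 0, signal probes are shifted by 3.
z <- c(rnorm(n_null, mean = 0), rnorm(n_signal, mean = 3))
p <- 2 * pnorm(-abs(z))   # two-sided p-values

observed <- sort(p)
expected <- ppoints(length(p))   # quantiles of the uniform distribution

# Ratio of expected to observed for the smallest p-values: values well
# above 1 mean the observed p-values are smaller than the uniform predicts.
head(expected / observed, 5)

# Fraction of p-values below 0.05; about 0.05 would be expected
# if every probe were null.
mean(p < 0.05)
```

A simulation like this cannot tell signal from bias: a systematic technical shift between groups would produce the same kind of departure from the uniform expectation.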
Thank you very much.

QQ plots were suggested to me as a way to look for biases in two different analyses. One seems to follow the uniform distribution, while the other deviates considerably. Is this probably due to different global methylation profiles? Can I conclude from it that there is a bias?

Thanks again,
Khadeeja