Top 10% of genes based on p-value in TopTable
Voke AO ▴ 760
@voke-ao-4830
Last seen 9.6 years ago
Hi all,

I'd like to know how to identify and highlight, in a volcano plot, the top 10% of genes ranked by the p-values from my limma analysis.

I can already highlight genes with an absolute logFC > 2 and a p-value < 0.01 (code below), but I would also like to know how many genes fall in the top 10% based on p-values alone.

Any help will be greatly appreciated.

Thanks.

-Avoks

    library(ggplot2)

    # Flag genes passing both the fold-change and p-value cutoffs
    results$threshold = as.factor(abs(results$logFC) > 2 & results$P.Value < 0.01)

    pdf("VolcanoPlot_GSE25724_9.pdf")
    g = ggplot(data=results, aes(x=logFC, y=-log10(P.Value), colour=threshold)) +
      geom_point(alpha=0.4, size=1.75) +
      theme(legend.position = "none") +  # was opts() in older ggplot2 releases
      xlim(c(-8, 8)) + ylim(c(0, 10)) +
      xlab("log2 fold change") + ylab("-log10 p-value")
    g
    dev.off()
limma • 1.7k views
Dan Du ▴ 210
@dan-du-5270
Last seen 10 weeks ago
Germany
Hi Ovokeraye,

If your current approach works, changing the first line is enough:

    results$threshold = as.factor(results$P.Value <= quantile(results$P.Value, 0.1))

By the way, limma also provides a volcanoplot function that draws this kind of plot.

HTH
Dan

On Mon, 2012-05-07 at 14:15 +0200, Ovokeraye Achinike-Oduaran wrote:
> Hi all,
>
> I'm curious to know how I can get and highlight the top 10% of the
> genes based on p-values that I get from my limma analysis in a volcano
> plot.
> [...]
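A minimal sketch of the quantile approach, assuming a simulated stand-in for the topTable output. The `results` data frame below is hypothetical; only the `logFC` and `P.Value` column names come from the thread. It flags genes at or below the 10th percentile of the p-values and counts them, which addresses the "how many genes" part of the question.

```r
# Simulated stand-in for limma's topTable() output (hypothetical data).
set.seed(1)
results <- data.frame(
  logFC   = rnorm(1000),
  P.Value = runif(1000)
)

# 10th percentile of the p-values: genes at or below it form the top 10%.
cutoff <- quantile(results$P.Value, 0.1)
results$threshold <- as.factor(results$P.Value <= cutoff)

# Number of genes in the top 10% by p-value.
n_top <- sum(results$P.Value <= cutoff)
table(results$threshold)
```

When many p-values are tied, the count can differ slightly from exactly 10% of the genes.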
Thanks Dan. One more question: I want to plot the number of genes against the p-value. What goes into the aes() portion of the command for the non-p-value axis?

Thanks again.

-Avoks
Voke AO ▴ 760
@voke-ao-4830
Last seen 9.6 years ago
Thanks a bunch, Dan.

Regards,
Avoks

On 08 May 2012, at 1:32 PM, Dan Du wrote:
> Hi Avoks,
>
> Well, that sounds like a totally different plot.
> To create such a cutoff benchmarking graph, you have to build a new data
> structure holding your candidate p-value thresholds and the corresponding
> number of significant genes. This does not need ggplot and its advanced
> syntax.
>
> x <- rnorm(100)^2/10
> cutoffs <- c(0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001)
> plot(cutoffs, sapply(cutoffs, function(y) sum(x <= y)),
>      ylab='No.SIG.DE.Gene', main='Title')
>
> HTH,
> Dan
>
> On Mon, 2012-05-07 at 14:55 +0200, Ovokeraye Achinike-Oduaran wrote:
>> Thanks Dan. One more question: I want to plot the no. of genes vs
>> p-value, so what goes into the aes portion of the command for the
>> non-p.value axis?
>> [...]
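The cutoff plot Dan describes could be applied to real p-values along these lines. This is only a sketch: `pvals` is a hypothetical vector standing in for `results$P.Value`, and the `counts` data frame and log-scaled axis are illustrative choices, not part of the original reply.

```r
# Hypothetical p-values standing in for results$P.Value.
set.seed(1)
pvals <- runif(1000)

# Candidate thresholds and the number of genes passing each one.
cutoffs <- c(0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001)
counts <- data.frame(
  cutoff = cutoffs,
  n_sig  = sapply(cutoffs, function(p) sum(pvals <= p))
)

# A log-scaled x axis keeps the small thresholds from bunching together.
plot(counts$cutoff, counts$n_sig, log = "x", type = "b",
     xlab = "p-value cutoff", ylab = "No. of genes")
```

Keeping the thresholds and counts together in one data frame also makes it easy to pass them to ggplot later, with `cutoff` and `n_sig` as the two aesthetics.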