limma and Rank Products: comparison of the number of results


Juan C Oliveros Collazos ▴ 190

@juan-c-oliveros-collazos-2665

Last seen 9.6 years ago

Dear all,

When working with comparative experiments based on Affymetrix gene expression arrays, I usually apply one of the following combinations of methods:

RMA + limma + FDR

or

RMA + Rank Products (the Rank Products "percentage of false prediction" values are supposed to be equivalent to FDR)

We usually obtain many more differentially expressed genes with Rank Products than with limma at the same FDR threshold.

I wonder whether your experience is the same. Do you obtain many more results with Rank Products than with limma at the same FDR cutoff?

In a recent experiment we obtained the opposite (more results with limma), and I'd like to hear about your experience with both methods regarding the number of results.

Best,

Juan Carlos Oliveros, Ph.D.
CNB-CSIC, Madrid, Spain

limma limma • 1.5k views

ADD COMMENT • link updated 14.2 years ago by Thomas Hampton ▴ 20 • written 14.2 years ago by Juan C Oliveros Collazos ▴ 190


Thomas Hampton ▴ 20

@thomas-hampton-3936

Last seen 9.6 years ago

I have used rank products and limma head to head many times, so I think I understand your question.

I can imagine scenarios where limma would provide a longer list, but I have not observed it myself.

It is in many ways amazing that the two tests generate similar lists at all, given the enormous difference in the way they go about selecting genes. Limma "cares" quite a lot about within-group variation, and can select genes with very small between-group differences as significant as long as within-group variation is sufficiently small. Genes with the smallest p-values appear at the head of the list in limma, whereas genes with the largest fold differences appear at the head of a RankProd list. The net of all this is that I believe you will find that the two lists are not only different in length, but quite different in order. So whether "false prediction rates" and "false discovery rates" are arguably similar or not, the two tests make rather different assumptions, which will ultimately drive the two estimates apart.

I generally feel that "longer is better" when it comes to gene lists, especially for pathway analysis, but I generally require a difference of 1.4-fold between conditions before I get enthusiastic.

Best,

Tom

On Feb 17, 2010, at 9:10 AM, Juan Carlos Oliveros wrote:

> Dear all
>
> When working with comparative experiment based on Affymetrix gene
> expression arrays I usually apply one of the following combination
> of methods:
>
> RMA + limma + FDR
>
> or
>
> RMA+ Rank Products
> (rank products "Percentage of false prediction" values are supposed
> to be equivalent to FDR)
>
> Usually we obtain much more differentially expressed genes when
> using Rank Products than when using limma at the same FDR threshold.
>
> I wonder if in your case is the same. Do you obtain many more
> results with Rank Products than with limma at the same FDR cutoff?
>
> In a recent experiment we obtained the opposite (more results with
> limma) and I'd like to know your experience when using both methods
> regarding the number of results.
>
> best,
>
> Juan Carlos Oliveros, Ph.D.
> CNB-CSIC, Madrid, Spain
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
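To make Tom's point about ordering concrete, here is a minimal Python sketch on invented toy data. It is only an illustration under stated assumptions: the t statistic is an ordinary equal-variance two-sample t, not limma's moderated t, and the rank product is a simplified paired version (geometric mean of per-replicate fold-change ranks), not the RankProd package itself. The gene names and values are hypothetical.

```python
import math
import statistics

# Toy log2 expression values: gene -> (control replicates, treated replicates).
# All numbers are invented for illustration.
data = {
    "tight_small": ([8.00, 8.01, 7.99], [8.30, 8.31, 8.29]),  # small shift, tiny variance
    "noisy_large": ([6.0, 7.5, 5.0], [8.5, 9.5, 7.6]),        # large shift, high variance
    "null_gene": ([7.0, 7.2, 6.9], [7.1, 6.9, 7.0]),          # no real change
}


def t_statistic(a, b):
    """Ordinary equal-variance two-sample t statistic (NOT limma's moderated t)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(b) - statistics.mean(a)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )


def rank_products(data):
    """Geometric mean of per-replicate ranks of the log fold change (rank 1 = most up)."""
    genes = list(data)
    n_rep = len(next(iter(data.values()))[0])
    log_rp = {g: 0.0 for g in genes}
    for i in range(n_rep):
        # Paired log2 fold change for replicate i (a simplifying assumption).
        diffs = {g: data[g][1][i] - data[g][0][i] for g in genes}
        ordered = sorted(genes, key=lambda g: diffs[g], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_rp[g] += math.log(rank) / n_rep
    return {g: math.exp(v) for g, v in log_rp.items()}


# Order genes by |t| (largest first) and by rank product (smallest first).
by_t = sorted(data, key=lambda g: abs(t_statistic(*data[g])), reverse=True)
rp = rank_products(data)
by_rp = sorted(data, key=lambda g: rp[g])

print("ordered by |t|:         ", by_t)
print("ordered by rank product:", by_rp)
```

In this toy set the low-variance gene with a 0.3 log2 shift leads the t-based list, while the noisy gene with a shift above 2 log2 units leads the rank-product list, which is the kind of disagreement in order described above.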

ADD COMMENT • link 14.2 years ago Thomas Hampton ▴ 20


Tom,

Thanks for your message. Your explanations about the differences between the two methods are very clear and will help us to understand our results.

Cheers,

Juan Carlos

ADD REPLY • link 14.2 years ago Juan C Oliveros Collazos ▴ 190


Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 16 days ago

EMBL European Molecular Biology Laborat…

Juan,

did you verify whether your estimates of "FDR" or "percentage of false prediction", in the way you apply them, are actually accurate, e.g. by independent validation experiments, or by applying your method to biological replicates (where you know that all discoveries are in fact false)?

Best wishes,
Wolfgang

Wolfgang Huber
whuber at embl.de
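The replicate check suggested here can be sketched in Python as follows. This is a rough illustration on simulated data, not an implementation of limma or RankProd: the per-gene test is a plain z-test with known unit variance standing in for either method, the multiple-testing step is a hand-written Benjamini-Hochberg adjustment, and the function names, gene count, and seed are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def mock_comparison(n_genes=2000, n_per_group=3, seed=1):
    """Split replicates of ONE simulated condition into two mock groups and test them."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pvalues = []
    for _ in range(n_genes):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        # z-test with the known (simulated) unit variance; a stand-in
        # for whichever method is actually being checked.
        z = (mean(group_a) - mean(group_b)) / (2.0 / n_per_group) ** 0.5
        pvalues.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return pvalues


pvalues = mock_comparison()
adjusted = bh_adjust(pvalues)
n_called = sum(q <= 0.05 for q in adjusted)
print(f"{n_called} of {len(adjusted)} genes called at FDR 0.05; "
      "any gene called here is a false discovery")
```

With real data, the same idea would mean running each pipeline on arrays from a single biological condition split into two arbitrary groups. A method whose error estimate is accurate should call few or no genes in such a comparison, so a long list at the usual cutoff would suggest the reported FDR or percentage of false prediction is too optimistic.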

ADD COMMENT • link 14.2 years ago Wolfgang Huber ★ 13k


Wolfgang,

We apply these kinds of methods just to find candidates to be confirmed experimentally in subsequent analyses. We expect to have a certain percentage of false positives, which for us is preferable to having false negatives.

But regardless of the cutoff, we observed that, in this case, Rank Products provides fewer statistically significant results than limma, which was not the case in the past. I just want to know whether people who use both methods observe the same differences.

Thanks a lot.

Juan Carlos

ADD REPLY • link 14.2 years ago Juan C Oliveros Collazos ▴ 190
