GSEAbase and limma
@javier-perez-florido-3121
Last seen 6.1 years ago
Dear list,

I'm new to GSEABase and have been looking at some of the examples in the "Bioconductor Case Studies" book. One data example follows these steps:

* Nonspecific filtering of the expression data object.
* Building the GeneSetCollection using KEGG (for example).
* Computing the per-gene test statistics using a t-test.
* Using a permutation test to assess which genes have a statistic with an unusually large absolute value relative to the permutation distribution.

My question is: can any kind of statistic be used? For example, the moderated t-statistic from limma? I know that limma uses the eBayes function, which borrows information from all genes to arrive at more stable estimates of each individual gene's variance. I don't know whether, in the GSEA context, it is correct to use this moderated statistic, which takes all genes into account (unlike the "standard" per-gene t-statistic).

Thanks,
Javier
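The steps listed in the question can be sketched in a few lines. This is a minimal illustration in plain Python, not the GSEABase code from the book: `expr`, `labels`, `per_gene_t`, `set_statistic` and `permutation_p_value` are hypothetical names, the Welch t-statistic and the sum-over-sqrt(n) set statistic are assumed choices, and the nonspecific filtering step is left out.

```python
import math
import random
import statistics


def t_statistic(x, y):
    """Welch two-sample t-statistic for a single gene."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        vx / len(x) + vy / len(y)
    )


def per_gene_t(expr, labels):
    """expr maps gene -> list of values (one per sample); labels is a 0/1 list."""
    out = {}
    for gene, values in expr.items():
        a = [v for v, g in zip(values, labels) if g == 1]
        b = [v for v, g in zip(values, labels) if g == 0]
        out[gene] = t_statistic(a, b)
    return out


def set_statistic(tstats, gene_set):
    """Sum of the per-gene statistics, scaled by the square root of the set size."""
    return sum(tstats[g] for g in gene_set) / math.sqrt(len(gene_set))


def permutation_p_value(expr, labels, gene_set, n_perm=1000, seed=0):
    """Two-sided p-value for one gene set, obtained by shuffling sample labels."""
    rng = random.Random(seed)
    observed = set_statistic(per_gene_t(expr, labels), gene_set)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Per-gene statistics are recomputed from scratch for every relabelling.
        permuted = set_statistic(per_gene_t(expr, shuffled), gene_set)
        if abs(permuted) >= abs(observed):
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

In this sketch `per_gene_t` is the only place where the choice of per-gene statistic enters, which is the part the question is about.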
limma GSEABase • 1.4k views
@sunny-srivastava-3793
Last seen 9.6 years ago
Dear Javier,

I am pretty sure more experienced members will have more, and deeper, things to say about your question. Here are my 25 cents:

A model-based statistic (such as the moderated t-statistic) and a permutation test are two different flavors of testing the null hypothesis. Comparing the two is, in my view, like comparing apples and oranges.

Each of these methods has its own advantages. If the model fits well, the moderated or unmoderated t-statistic should be preferred. If you have no idea what the model is, or you are not sure whether the model assumptions hold for the data, then a permutation test would be a "wiser" (but not necessarily better) choice.

A lot more could be said here, but in short: a permutation test is always available, yet it might not give better results than a model-based test statistic would (the t-test is quite robust to violations of its assumptions).

This should apply to your example as well. You are allowed to use the moderated t-statistic.

Please correct me if I am wrong. I am also learning my statistics :-)

Thanks and best regards,
S.
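To make the permutation side of this comparison concrete, here is a small sketch of an exact permutation test for a single gene. It assumes a two-group design and uses the absolute difference in group means as the statistic; both are choices made for illustration, and `exact_permutation_p` is a hypothetical helper, not a Bioconductor function. No distributional model is needed, but the attainable p-values are limited by the number of possible relabellings.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(values, group_a_idx):
    """Exact two-sided permutation p-value for a difference in group means.

    values      : expression values of one gene, one per sample
    group_a_idx : indices of the samples that belong to group A
    """
    n, k = len(values), len(group_a_idx)

    def diff(idx):
        chosen = set(idx)
        a = [values[i] for i in chosen]
        b = [values[i] for i in range(n) if i not in chosen]
        return mean(a) - mean(b)

    observed = abs(diff(group_a_idx))
    total = extreme = 0
    # Enumerate every way of assigning k of the n samples to group A.
    for idx in combinations(range(n), k):
        total += 1
        # Small tolerance so that ties are not lost to rounding error.
        if abs(diff(idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With 4 samples per group there are only 70 ways to split the samples, so the smallest possible two-sided p-value is 2/70 (about 0.029), however strong the effect. That is one reason a well-fitting model-based statistic can be preferable for small experiments.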
Dear Sunny,

Thanks for your reply regarding the use of parametric versus nonparametric statistical tests. What I meant to ask about is the use of a "global" parametric method such as limma in the context of gene set enrichment, which is used to find biological themes in gene sets. My question is whether limma is suitable when building groups of genes, since the eBayes function uses information from ALL genes rather than from individual genes... :-)

Javier
@gordon-smyth
Last seen 55 minutes ago
WEHI, Melbourne, Australia
Hi Javier,

It's OK, as long as you repeat the whole eBayes procedure for each permutation. The smoothed standard errors are statistically independent of the moderated t-statistics, hence independent of your category inference.

You might also consider the roast() and romer() functions, which use the empirical Bayes statistics explicitly.

Best wishes
Gordon
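Gordon's point about repeating the whole procedure for each permutation can be sketched as follows. This is plain Python, not limma: `moderated_t` is a simplified, hypothetical stand-in for lmFit() followed by eBayes(), with the prior degrees of freedom `d0` fixed by hand and the prior variance taken as the mean pooled variance across genes, whereas limma estimates both from the data. What matters for the illustration is that the shrinkage is recomputed from all genes inside every permutation.

```python
import math
import random
from statistics import mean, variance


def moderated_t(expr, labels, d0=4.0):
    """Simplified moderated t-statistics for a two-group design.

    The shrunken variance is (d0 * s0^2 + d * s^2) / (d0 + d), where s^2 is the
    gene's pooled variance on d residual degrees of freedom. d0 is an assumed
    constant and s0^2 is the mean of s^2 over ALL genes (both simplifications).
    """
    n1 = sum(labels)
    n0 = len(labels) - n1
    d = n1 + n0 - 2
    diffs, s2 = {}, {}
    for gene, values in expr.items():
        a = [v for v, g in zip(values, labels) if g == 1]
        b = [v for v, g in zip(values, labels) if g == 0]
        diffs[gene] = mean(a) - mean(b)
        s2[gene] = ((n1 - 1) * variance(a) + (n0 - 1) * variance(b)) / d
    s0_sq = mean(s2.values())  # information borrowed from every gene
    scale = math.sqrt(1 / n1 + 1 / n0)
    return {
        gene: diffs[gene]
        / (math.sqrt((d0 * s0_sq + d * s2[gene]) / (d0 + d)) * scale)
        for gene in expr
    }


def set_p_value(expr, labels, gene_set, n_perm=1000, seed=0):
    """Permutation p-value for the mean moderated t of one gene set."""
    rng = random.Random(seed)

    def score(current_labels):
        # The whole moderation step is rerun, so the prior variance is
        # re-estimated from the relabelled data every time.
        t = moderated_t(expr, current_labels)
        return abs(mean(t[g] for g in gene_set))

    observed = score(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if score(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Computing the shrinkage once on the real labels and then permuting only the resulting statistics would not follow this advice. For real analyses, the roast() and romer() functions named in the reply are the limma-native route.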