Test on correlations among a group of genes
heyi xiao ▴ 360
@heyi-xiao-3308
Last seen 8.2 years ago
United States
Dear list,

I have an expression microarray dataset. I would like to test whether the correlations among a group of genes are significantly higher than those among all genes. What is the proper statistical test to use? Note that the correlation coefficients (a matrix) for the target gene group, or for the background whole set, are not all independent, which makes the test a little trickier. I would appreciate any thoughts or suggestions.

Heyi
Microarray Microarray • 1.2k views
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 3.6 years ago
United States
(In reply to heyi xiao, 05:58 PM 2/26/2009.)

Although I think the concept is clear in some special cases, such as all the cross-correlations among genes in one set being higher than all the cross-correlations in another, I am not sure you are asking a well-defined question. For example:

Set 1:            Set 2:
1   .6  .6        1   .7  .5
.6  1   .6        .5  1   .7
.6  .6  1         .7  .5  1

Which set is more highly correlated?

--Naomi
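To make the ambiguity concrete, here is a minimal sketch in Python (standard library only) that summarizes the off-diagonal entries of the two example matrices. The matrices are copied as written in the post; the helper names `off_diagonal` and `summarize` are made up for this illustration.

```python
from statistics import mean

# The two example matrices, copied as written in the post above.
set1 = [[1.0, 0.6, 0.6],
        [0.6, 1.0, 0.6],
        [0.6, 0.6, 1.0]]
set2 = [[1.0, 0.7, 0.5],
        [0.5, 1.0, 0.7],
        [0.7, 0.5, 1.0]]


def off_diagonal(matrix):
    """Return all off-diagonal entries of a square matrix as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def summarize(matrix):
    """Summarize the off-diagonal entries by mean, minimum and maximum."""
    values = off_diagonal(matrix)
    return {"mean": mean(values), "min": min(values), "max": max(values)}


print("Set 1:", summarize(set1))
print("Set 2:", summarize(set2))
```

Both sets have the same average off-diagonal correlation (0.6), but Set 2 contains both higher and lower values than Set 1. Whether Set 2 counts as "more highly correlated" therefore depends on which summary is chosen, which is the point of the example.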
(In reply to Naomi Altman, Thursday, February 26, 2009, 11:38 PM.)

Thanks, Naomi. I am asking two things.

First, how to compare the cross-correlations among genes in two gene sets of the same size. This includes both scenarios you pointed out: the all-higher-than-all one and the not-so-well-defined one. I want a statistical test that gives a summary p-value for the comparison.

Second, how significantly correlated the genes in one particular set are relative to all genes. This problem is related to the first one, in that we can always randomly pick control sets of the same size from the whole gene list.

Thanks a lot!

Heyi
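The idea of drawing random control sets of the same size can be sketched as a resampling test. The code below is only an illustration under stated assumptions, not a method endorsed in this thread: it assumes the expression data is a dict `expr` mapping gene name to a list of values across samples, it uses the mean absolute pairwise Pearson correlation as the summary statistic (one possible choice among many), and it assumes no gene has zero variance. All function names are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_abs_correlation(expr, genes):
    """Assumed summary statistic: mean |r| over all gene pairs in the set."""
    pairs = list(combinations(genes, 2))
    return sum(abs(pearson(expr[g], expr[h])) for g, h in pairs) / len(pairs)


def resampling_p_value(expr, target, n_draws=1000, seed=0):
    """Compare the target set with random control sets of the same size.

    Returns the fraction of control sets whose statistic is at least as
    large as the target's, with the usual +1 correction so that the
    estimate is never exactly zero.
    """
    rng = random.Random(seed)
    all_genes = sorted(expr)
    observed = mean_abs_correlation(expr, target)
    hits = 0
    for _ in range(n_draws):
        control = rng.sample(all_genes, len(target))
        if mean_abs_correlation(expr, control) >= observed:
            hits += 1
    return (hits + 1) / (n_draws + 1)
```

Because each control set is summarized by a single number, the dependence among the correlations within a set is carried into the reference distribution rather than ignored. The answer still depends on the chosen summary statistic, so the sketch does not resolve the question of which set is "more correlated" in the ambiguous case.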
(In reply to heyi xiao, Fri, 2009-02-27 at 07:29 -0800.)

hi Heyi,

i'd look at the empirical cumulative distribution functions of the absolute values of the correlations and test whether the areas enclosed by these two functions are significantly different. i'm not completely sure which test you should use (maybe somebody else on the list has a clear hint!) but i think ks.test would do.

cheers,
robert.
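For reference, the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the two empirical cumulative distribution functions. A minimal Python sketch of the statistic alone is shown below; the function names are hypothetical, and the inputs are assumed to be the absolute correlations from the target set and from the background. It deliberately computes no p-value, since the standard KS p-value assumes independent observations.

```python
from bisect import bisect_right


def ecdf(sorted_values, x):
    """Empirical CDF: fraction of values less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the two empirical CDFs.

    Both CDFs are step functions that only change at observed values,
    so it is enough to evaluate the gap at every observed value.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


# Hypothetical usage: absolute pairwise correlations from two gene sets.
target_abs_r = [0.62, 0.71, 0.58, 0.66, 0.74, 0.69]
background_abs_r = [0.12, 0.31, 0.08, 0.44, 0.27, 0.19, 0.52, 0.05]
print(ks_statistic(target_abs_r, background_abs_r))
```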
(In reply to Robert Castelo, 10:19 AM 2/27/2009.)

I do not know how to do the test, but I have reservations about using KS. The correlations are themselves correlated, and the KS test statistic seems likely to be sensitive to this.

--Naomi
(In reply to Naomi Altman, 2009/2/28.)

I think there is a solution based on a nonparametric homogeneity test. In short, one calculates all within-set correlations and all between-set correlations. If the percentage of between-set correlations that are higher than the average within-set correlation exceeds, let's say, 5%, then the hypothesis that within-set and between-set correlations differ significantly (i.e. the hypothesis of correlation inhomogeneity) can be discarded. The limitation of this approach is that it requires a relatively large number of between-set correlations (> 100), but I do not think that would be a problem in your case. An even better approach would include a few replicates of the same microarray experiment.

The same problem arose in peptide mapping, and the solution is described in Debeljak et al., Journal of Chromatography A, 1062 (2005) 79-86.

Hope this helps.

Zeljko Debeljak, PhD
Medical Biochemistry Specialist
Osijek Clinical Hospital
CROATIA
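The decision rule described in this post can be sketched in a few lines of Python. This follows only the description given here and is not checked against the cited paper; the function names are hypothetical, and the 5% threshold is the example value mentioned in the post. The inputs are assumed to be two flat lists of correlation coefficients, one for gene pairs within the set and one for pairs with one gene inside the set and one outside.

```python
def fraction_between_above_within(within, between):
    """Fraction of between-set correlations above the mean within-set correlation."""
    within_mean = sum(within) / len(within)
    exceeding = sum(1 for r in between if r > within_mean)
    return exceeding / len(between)


def inhomogeneity_discarded(within, between, threshold=0.05):
    """Apply the rule from the post: if more than `threshold` of the
    between-set correlations exceed the mean within-set correlation,
    the hypothesis of a significant within/between difference is discarded.
    """
    return fraction_between_above_within(within, between) > threshold


# Hypothetical toy values; the post recommends more than 100 between-set
# correlations, so a real application would use far longer lists.
within = [0.8, 0.9, 0.7]
between = [0.1, 0.2, 0.85, 0.3]
print(fraction_between_above_within(within, between))
print(inhomogeneity_discarded(within, between))
```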