When you do thousands of correlations with only 4 samples, you can expect
a lot of very high correlations just by chance. So, you should filter your
genes by some criterion of "interestingness" before performing correlation
analysis.

--Naomi

At 12:02 PM 6/28/2010, Steve Lianoglou wrote:
>Hi,
>
>On Mon, Jun 28, 2010 at 10:53 AM, Yuan Hao wrote:
> > Dear List,
> >
> > I would like to ask if there is a Bioconductor package available that
> > can help to achieve the following purpose. Thank you very much in advance!
> >
> > I got 16 Affy chips corresponding to 4 samples: wild-type treated,
> > wild-type untreated, knocked-down treated, and knocked-down untreated,
> > i.e. 4 replicates for each sample.
> >
> > I want to look at the expression correlations between genes. Say my gene
> > of interest is gene X. I would like to find the other genes on the chip
> > that have expression profiles similar to gene X's across samples. In
> > other words, if the expression level of gene X increased from wild-type
> > treated to knocked-down treated, I would like to find all the other genes
> > that have the same trend.
>
>Given the size of the Bioconductor universe, it's hard to say with any
>certainty that a certain function does NOT exist, but I'd be somewhat
>surprised if this function actually is there, since it's relatively
>easy for you to implement yourself.
>
>You are essentially repeatedly performing a test against each row of
>your expression matrix, so think "loops" or some incantation of *apply
>methods.
>
>Here's an easy one. Let's assume:
>  * `exprs` is a (gene x experiment) matrix with your expression values.
>  * the value `x` holds the row index of the gene you are interested in
>
>R> set.seed(123)
>R> exprs <- matrix(rnorm(100), 5)
>R> x <- 1
>
>Now you want to test the correlation of the vector at row x with the rest.
>
>R> cors <- apply(exprs[-x,], 1, cor.test, exprs[x,])
>
>This gives you a list of correlation tests that you can (i) get the
>statistic out of; and (ii) order
>
>R> cors.estimate <- sapply(cors, '[[', 'estimate') ## (i)
>R> alike <- order(cors.estimate, decreasing=TRUE)  ## (ii)
>
>`alike` now has the indices of genes, ordered from "most + correlated" to
>"most - correlated" to gene "x". Note that these are row indices into
>exprs[-x,] (i.e. with gene x removed), not into exprs itself.
>
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>If you're a bit more familiar with R functions, you might know that
>there is a function named "cor" that creates a correlation matrix
>out of a matrix. This function works column-wise, so you first have to
>transpose your matrix:
>
>R> all.cors <- cor(t(exprs))
>
>R> cors.estimate
>        cor         cor         cor         cor
>-0.01971735 -0.26353249  0.03361119 -0.11578081
>
>R> all.cors[1,]
>[1]  1.00000000 -0.01971735 -0.26353249  0.03361119 -0.11578081
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>The various cluster/heatmap functions will group your genes row-wise
>(and column-wise) for you. Note that heatmap() clusters on Euclidean
>distance (via dist) by default, so you have to pass a correlation-based
>`distfun` if you want correlation-based clustering.
>Look at ?heatmap and check what that function returns to you in the
>"Value" section.
>
>-steve
>
>--
>Steve Lianoglou
>Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
>Contact Info:
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at
>
>Search the archives:

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111
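
Naomi's point about chance correlations can be checked with a small simulation. The sketch below is only an illustration on made-up noise, not data from this thread: the gene count, the |r| > 0.9 cutoff, and the variance filter are all assumptions chosen for the example. With four values per gene, Pearson's r between two unrelated normal vectors is uniformly distributed on [-1, 1], so the fraction printed at the end should come out near 0.10.

```r
## Hypothetical illustration: simulated noise, not data from this thread.
set.seed(1)
n.genes   <- 2000   # assumed gene count, for illustration only
n.samples <- 4      # four values per gene, as in Naomi's comment

## Pure-noise "expression" matrix (gene x sample): no real relationships.
noise <- matrix(rnorm(n.genes * n.samples), nrow = n.genes)

## Correlate gene 1 against every other gene in one call to cor().
r <- cor(t(noise[-1, , drop = FALSE]), noise[1, ])[, 1]

## Fraction of unrelated genes that look "highly correlated" by chance.
frac.high <- mean(abs(r) > 0.9)
cat("fraction with |r| > 0.9:", frac.high, "\n")

## One possible "interestingness" filter (an assumption, not something
## prescribed in the thread): keep the 10% most variable genes, so that
## far fewer correlations are computed in the first place.
gene.var <- apply(noise, 1, var)
keep <- which(gene.var >= quantile(gene.var, 0.9))
cat("genes kept after filtering:", length(keep), "\n")
```

On real data the filter would be applied before the correlation step, using whatever criterion fits the experiment (for example, genes that respond to the treatment or the knock-down).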
