How to search for coexpression?


stueber@mpiz-koeln.mpg.de ▴ 10


Dear colleagues,

After reading data from Affymetrix CEL files, I would like to get information about coexpressed and non-coexpressed genes for a chosen set of probesets.

I start with:

> library(affy)
> Data <- ReadAffy()
> eset <- rma(Data)

This gives me a big array of all intensities for all probesets and experiments. But what to do then?

Any hints appreciated.

Kurt Stueber


updated 18.2 years ago by Björn Usadel ▴ 250 • written 18.2 years ago by stueber@mpiz-koeln.mpg.de ▴ 10


michael watson IAH-C ★ 3.4k


OK, so your data can be accessed as a numeric matrix (probesets in rows, arrays in columns) by executing:

library(Biobase)
exprs(eset)

The function cor computes an all-against-all correlation matrix but, depending on the size of your data set, this may cause a few problems with memory; I don't know. If you do use cor, you will have to transpose your data first, as cor computes correlation coefficients between columns, not rows.

?cor
?t

You may also need

?as.matrix

if you ever need to convert a data.frame to a matrix.

(In reply to stueber@mpiz-koeln.mpg.de, sent Tue 07/03/2006 11:09 AM, subject: [BioC] How to search for coexpression?)
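To make the transposition step concrete, here is a minimal sketch in base R. It uses a small simulated matrix as a stand-in for exprs(eset); the object names (expr, probe_cor) and the figure of 22,000 probesets are assumptions for illustration and do not come from the thread.

```r
# Simulated stand-in for exprs(eset): probesets in rows, arrays in columns.
set.seed(1)
expr <- matrix(rnorm(200 * 12), nrow = 200, ncol = 12,
               dimnames = list(paste0("probe_", 1:200),
                               paste0("array_", 1:12)))

# cor() correlates columns, so transpose to put the probesets in columns.
probe_cor <- cor(t(expr))
dim(probe_cor)            # one row and one column per probeset

# Rough size of a full all-against-all matrix of doubles (8 bytes each).
# n_probes is an illustrative figure only.
n_probes <- 22000
n_probes^2 * 8 / 1024^3   # size in GiB, roughly 3.6
```

The last line is why an all-against-all matrix can become a memory problem on a full chip, while a filtered or hand-picked set of probesets is usually fine.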

written 18.2 years ago by michael watson IAH-C ★ 3.4k


David Ruau ▴ 110


If you already have a subset of chosen probesets you want to look at, you can perform a hierarchical clustering on it to see which probesets are co-expressed under the same conditions (this is informative as long as your list is not too big). It is just one approach among many others...

Take a look at the heatmap function, used here with a colour palette from the simpleaffy package and a distance function from the amap package.

> library(simpleaffy)
> library(amap)
> e.eset <- exprs(eset)   # restrict the rows to your chosen probesets first
> heatmap(e.eset,
+         col = blue.white.red.cols(64),
+         distfun = function(c) Dist(c, method = "pearson"),
+         labCol = colnames(e.eset),
+         hclustfun = function(c) hclust(c, method = "mcquitty"))

David

N.B.: also have a look at: http://www.bepress.com/bioconductor/
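If you want the cluster memberships themselves rather than a picture, the same idea can be sketched in base R. This is only an illustration under stated assumptions: the data are simulated, the names (expr_sub, groups) are hypothetical, and the distance is taken as 1 minus the Pearson correlation, which may not match the definition amap's Dist uses.

```r
# Six hypothetical probesets measured on 10 arrays: three follow a sine
# profile, three follow a cosine profile (the two profiles are uncorrelated).
set.seed(2)
n_arrays <- 10
profile_a <- sin(2 * pi * seq_len(n_arrays) / n_arrays)
profile_b <- cos(2 * pi * seq_len(n_arrays) / n_arrays)
expr_sub <- rbind(
  t(replicate(3, profile_a + rnorm(n_arrays, sd = 0.1))),
  t(replicate(3, profile_b + rnorm(n_arrays, sd = 0.1)))
)
rownames(expr_sub) <- paste0("probe_", 1:6)

# Correlation-based distance between probesets (rows), then clustering
# with the same linkage method as in the heatmap call above.
d  <- as.dist(1 - cor(t(expr_sub)))
hc <- hclust(d, method = "mcquitty")

# Cut the tree into two groups and list the members of each.
groups <- cutree(hc, k = 2)
split(names(groups), groups)
```

plot(hc) draws the dendrogram; the number of groups (k = 2 here) is a choice you have to make for your own data.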

written 18.2 years ago by David Ruau ▴ 110


Björn Usadel ▴ 250


Hi Kurt,

as Michael Watson already noted, you can use cor. But unless you did significant filtering, you will usually not be able to calculate an all-versus-all matrix. So you might want to use

tgenes <- t(expressionvalues)

as noted, and then

cor(tgenes[, yourgenelist], tgenes)

which will "only" give you the correlation coefficients of your genes against all others.

However, setting a good threshold for "coexpression" might be difficult. You might want to experiment with p-values here (use Fisher's z-transformation). But if you have a huge number of arrays, you will get "significant" p-values even though the co-expression is minimal. On the other hand, if you have very few arrays, you will hardly ever reach significance.

Also try

cor(tgenes[, yourgenelist], tgenes, method = "spearman")

which uses Spearman's rank correlation. The default, Pearson, is very sensitive to outliers. However, Spearman correlation needs more array data; otherwise you will run into trouble, since all your numerical values are transformed into ranks, and with few arrays only few ranks are possible.

Cheers,
Björn
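A minimal sketch of these steps in base R, on simulated data. The probeset names, the number of arrays and the planted co-expressed pair are assumptions made up for the example; the p-values use the usual normal approximation for Fisher's z, with standard error 1 / sqrt(n - 3).

```r
# Simulated expression values: 500 probesets (rows) on 20 arrays (columns).
set.seed(3)
n_arrays <- 20
expressionvalues <- matrix(rnorm(500 * n_arrays), nrow = 500,
                           dimnames = list(paste0("probe_", 1:500), NULL))
# Plant one genuinely co-expressed pair: probe_2 tracks probe_1.
expressionvalues["probe_2", ] <-
  expressionvalues["probe_1", ] + rnorm(n_arrays, sd = 0.2)

tgenes <- t(expressionvalues)             # probesets are now columns
yourgenelist <- c("probe_1", "probe_10")  # hypothetical genes of interest

# Chosen genes against all others: a 2 x 500 matrix, not 500 x 500.
r <- cor(tgenes[, yourgenelist], tgenes)
r[cbind(yourgenelist, yourgenelist)] <- NA   # ignore self-correlations

# Fisher's z-transformation: atanh(r) is approximately normal with
# standard error 1 / sqrt(n - 3) when the true correlation is zero.
z <- atanh(r) * sqrt(n_arrays - 3)
p <- 2 * pnorm(-abs(z))

# Best partner of probe_1 and its two-sided p-value.
best <- names(which.max(r["probe_1", ]))
c(partner = best, p_value = signif(p["probe_1", best], 3))

# Rank-based alternative, less sensitive to outliers.
r_spearman <- cor(tgenes[, yourgenelist], tgenes, method = "spearman")
```

Because n enters the p-value only through sqrt(n - 3), the same r becomes "significant" much more easily with many arrays, which is the effect described above.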

written 18.2 years ago by Björn Usadel ▴ 250
