How to search for coexpression?
@stuebermpiz-koelnmpgde-1637
Last seen 10.3 years ago
Dear colleagues,

After reading data from Affymetrix CEL files I would like to get information about coexpressed and non-coexpressed genes for a chosen set of probesets. I start with:

> library(affy)
> Data <- ReadAffy()
> eset <- rma(Data)

This gives me a big array of all intensities for all probesets and experiments. But what to do then?

Any hints appreciated.

Kurt Stueber
@michael-watson-iah-c-378
Last seen 10.3 years ago
OK, so your data can be accessed as a matrix (probesets in rows, arrays in columns) by executing:

> library(Biobase)
> exprs(eset)

The function cor computes an all-against-all correlation matrix, but depending on the size of your data set this may cause memory problems; I don't know. If you do use cor, you will have to transpose your data first, as cor computes correlation coefficients on columns, not rows. See:

> ?cor
> ?t

You may also need ?as.matrix if your expression values are held in a data.frame and need to be converted to a matrix.
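The transpose-then-cor step can be sketched on a small simulated matrix standing in for exprs(eset). The object `expr`, its dimensions, and the probeset names are made up for illustration, and the probeset count used for the memory estimate is only an example size.

```r
# Simulated stand-in for exprs(eset): 50 probesets (rows) x 8 arrays (columns).
set.seed(1)
expr <- matrix(rnorm(50 * 8), nrow = 50,
               dimnames = list(paste0("probe_", 1:50),
                               paste0("array_", 1:8)))

# cor() correlates columns, so transpose to put probesets in columns.
cc <- cor(t(expr))

dim(cc)   # 50 x 50: one row and one column per probeset

# Rough size of an all-against-all matrix for n probesets,
# at 8 bytes per double. n is an assumed example value.
n <- 22283
n^2 * 8 / 2^30   # size in GiB, roughly 3.7 here
```

The estimate grows with the square of the number of probesets, which is why filtering the probesets first, or correlating only a chosen subset against the rest, is usually more practical than the full matrix.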
David Ruau ▴ 110
@david-ruau-1473
Last seen 10.3 years ago
If you already have a subset of chosen probesets you want to look at, you can perform hierarchical clustering on it to see which probesets are co-expressed in the same conditions (this can be informative if your list is not too big). It is just one approach among many others.

Take a look at the heatmap function (from the stats package), used here together with the simpleaffy and amap packages:

> library(simpleaffy)
> library(amap)
> e.eset <- exprs(eset)
> heatmap(e.eset, col=blue.white.red.cols,
+         distfun=function(c){Dist(c,method="pearson")},
+         labCol=colnames(e.eset), Rowv=cov(e.eset),
+         hclustfun=function(c){hclust(c,method='mcquitty')})

David

N.B.: also have a look at: http://www.bepress.com/bioconductor/
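For readers without simpleaffy or amap installed, the same clustering idea can be sketched with base R only. This is not the call from the post: it swaps in 1 minus the Pearson correlation as the distance and average linkage for the clustering, and the matrix `sub` is simulated purely for illustration.

```r
set.seed(2)
# Hypothetical subset: 6 probesets x 10 arrays. The first three share a
# common profile plus a little noise; the last three are pure noise.
base <- rnorm(10)
sub <- rbind(t(replicate(3, base + rnorm(10, sd = 0.1))),
             matrix(rnorm(3 * 10), nrow = 3))
rownames(sub) <- paste0("probe_", 1:6)

# Correlation distance between probesets: 1 - Pearson correlation.
# cor() works on columns, hence the transpose.
d <- as.dist(1 - cor(t(sub)))

# Hierarchical clustering, then cut the tree into 3 groups.
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 3)
groups

# plot(hc) draws the dendrogram; heatmap(sub) draws a clustered heatmap.
```

Probesets that land in the same group have similar profiles across the arrays. The number of groups `k` is a free choice here and would need to be tuned on real data.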
Björn Usadel ▴ 250
@bjorn-usadel-1492
Last seen 10.3 years ago
Hi Kurt,

as Michael Watson already noted, you can use cor. But unless you have done significant filtering, you will usually not be able to calculate an all-versus-all matrix. So you might want to use

> tgenes <- t(expressionvalues)

as noted, and then

> cor(tgenes[,yourgenelist], tgenes)

which will "only" give you the correlation coefficients of your genes against all others.

However, setting a good threshold for "coexpression" might be difficult. You might want to experiment with p-values here (use Fisher's z-transformation). But if you have a huge number of arrays you will get "significant" p-values even though the co-expression is minimal. On the other hand, if you have very few arrays, you will hardly ever get significance.

Also try

> cor(tgenes[,yourgenelist], tgenes, method="spearman")

which uses Spearman's rank correlation. The default, Pearson, is very sensitive to outliers. However, Spearman correlation needs more array data; otherwise you will run into trouble, since all your numerical values are transformed into ranks and with few arrays only a few ranks are possible.

Cheers,
Björn
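A minimal sketch of the subset-against-all correlation and the Fisher z p-values, on simulated data. The matrix `expr`, the number of arrays, and the planted coexpressed pair are all assumptions for illustration; `tgenes` and `yourgenelist` follow the names used above.

```r
set.seed(3)
n_arrays <- 12
# Simulated expression values: 200 probesets (rows) x 12 arrays (columns).
expr <- matrix(rnorm(200 * n_arrays), nrow = 200,
               dimnames = list(paste0("probe_", 1:200), NULL))
# Make probe_2 track probe_1 so there is one clearly coexpressed pair.
expr["probe_2", ] <- expr["probe_1", ] + rnorm(n_arrays, sd = 0.2)

tgenes <- t(expr)                       # arrays in rows, probesets in columns
yourgenelist <- c("probe_1", "probe_10")

# Chosen probesets against all others: a 2 x 200 matrix, not 200 x 200.
r_pearson  <- cor(tgenes[, yourgenelist], tgenes)
r_spearman <- cor(tgenes[, yourgenelist], tgenes, method = "spearman")

# Fisher's z-transformation: atanh(r) * sqrt(n - 3) is approximately
# standard normal when the true (Pearson) correlation is zero.
# Clamp r slightly inside (-1, 1) so self-pairs do not give atanh(1) = Inf.
r <- pmin(pmax(r_pearson, -0.999999), 0.999999)
z <- atanh(r) * sqrt(n_arrays - 3)
p <- 2 * pnorm(-abs(z))                 # two-sided p-values

# Strongest partners of probe_1 (the first entry is probe_1 itself).
head(sort(r_pearson["probe_1", ], decreasing = TRUE), 3)
```

Because each chosen probeset is tested against every other probeset, the raw p-values would normally be adjusted for multiple testing, for example with p.adjust, before any threshold is applied.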