Question

Gene Co-Expression Network using Pearson's Correlation


web.surf.profile • 0

@websurfprofile-23321


Hi,

I am a noob statistician and I am trying to build a gene co-expression network using Pearson's correlation as a distance metric in my data. As I was doing this, I calculated the correlation matrix, which has values in [-1, 1], squared it to bring the values into [0, 1], applied a threshold of 0.7 (totally arbitrary), and built a network.

While researching, I came across the package "coexnet", which does this in one step. However, when I compared the results, they differed. When I looked into the source code, I found this line:

simil <- abs(cor(t(difexp),use =  "pairwise.complete.obs")) # line nbr 82 in the internalFunctions.R file

The author seems to have taken an absolute value.

What is the advice in this instance? Do we take the abs or square it? To me, negative values in the r matrix mean the genes are dissimilar, but that distinction is lost when we take the abs.

Thanks in advance!


updated 4.0 years ago by Kevin Blighe ★ 3.9k • written 4.0 years ago by web.surf.profile • 0

score 0 · Answer 1 · 2020-04-13

This comes down to whether the network will be signed or unsigned, with the correlation values used as edge weights. If you take the absolute correlation values, the network will be unsigned and you lose the information on directionality [of the correlation]. A signed network, however, lets you infer both the 'weight' [of the correlation] and its directionality. You need to decide which is best for your own study.
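To make the comparison concrete, here is a minimal base R sketch on a made-up toy matrix (the gene names, values, and the 0.7 cutoff are assumptions for illustration, not taken from coexnet or from the poster's data). It shows that squaring and taking the absolute value both discard the sign, but the same threshold is stricter on r^2 than on |r| (r^2 >= 0.7 corresponds to |r| >= sqrt(0.7), about 0.84), which is one plausible reason two such pipelines would give different networks.

```r
# Toy expression matrix: 4 genes (rows) x 8 samples (columns); values are made up.
x <- 1:8
e <- rep(c(1, -1), 4)
expr <- rbind(
  g1 = x,
  g2 = x + 1.5 * e,                   # moderately positively correlated with g1
  g3 = -x + 0.5 * e,                  # strongly negatively correlated with g1
  g4 = c(1, -1, -1, 1, 1, -1, -1, 1)  # uncorrelated with the others
)

# Gene-by-gene Pearson correlation, values in [-1, 1].
r <- cor(t(expr), use = "pairwise.complete.obs")

threshold <- 0.7

# Three ways of turning r into an adjacency matrix.
adj_abs <- abs(r) >= threshold  # unsigned, based on |r|
adj_sq  <- r^2 >= threshold     # unsigned, based on r^2 (stricter at the same cutoff)
adj_pos <- r >= threshold       # signed, keeps positive correlations only

# Remove self-loops.
diag(adj_abs) <- diag(adj_sq) <- diag(adj_pos) <- FALSE

# Count each undirected edge once.
count_edges <- function(adj) sum(adj[upper.tri(adj)])

print(round(r, 2))
cat("edges with |r| >= 0.7:", count_edges(adj_abs), "\n")
cat("edges with r^2 >= 0.7:", count_edges(adj_sq), "\n")
cat("edges with r   >= 0.7:", count_edges(adj_pos), "\n")
```

In this toy example the g1-g2 pair (r of about 0.80) passes the |r| cutoff but not the r^2 cutoff, while the strongly negative g1-g3 pair is kept by both unsigned versions and dropped by the signed one.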

For what it's worth, I developed my own tutorial where I deal with this issue by simply colouring the edges blue (for negative / inverse correlations) and red (for positive correlations): Tutorial: Network plot from expression data in R using igraph.
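As a rough illustration of the colouring idea (not the code from the tutorial itself), the sketch below thresholds on |r| but keeps the signed value as the edge weight and derives an edge colour from its sign. The correlation matrix, gene names, and cutoff are hypothetical.

```r
# Hypothetical correlation matrix for four genes (values made up).
genes <- c("g1", "g2", "g3", "g4")
r <- matrix(c( 1.00,  0.85, -0.90,  0.10,
               0.85,  1.00, -0.75,  0.05,
              -0.90, -0.75,  1.00,  0.20,
               0.10,  0.05,  0.20,  1.00),
            nrow = 4, byrow = TRUE, dimnames = list(genes, genes))

threshold <- 0.7

# Keep each gene pair once (upper triangle) if |r| passes the threshold.
keep <- upper.tri(r) & abs(r) >= threshold
idx  <- which(keep, arr.ind = TRUE)

# Edge list that retains the signed correlation as the weight.
edges <- data.frame(
  from   = rownames(r)[idx[, "row"]],
  to     = colnames(r)[idx[, "col"]],
  weight = r[idx],
  stringsAsFactors = FALSE
)

# Blue for negative / inverse correlations, red for positive ones.
edges$colour <- ifelse(edges$weight < 0, "blue", "red")

print(edges)
```

A data frame laid out like this (two endpoint columns followed by attribute columns) is the shape that igraph's `graph_from_data_frame()` accepts, so the `colour` column could then be used as the edge colour when plotting.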

Kevin