Question: WGCNA: Understanding module-trait correlations
bekah20 wrote:

Hiya,
 

Was just wanting to clarify my understanding of the WGCNA output, as I have been reading various articles and have gotten confused. With the module-trait heatmap, does a positive correlation mean that all the genes in the module have higher expression when associated with the trait? So, say, with treated (1) and untreated (0), do the genes have higher expression when the group has been treated than when untreated? I.e., as the trait becomes greater (0 to 1), the expression becomes greater?

Best wishes,

Rebekah

wgcna package • 1.3k views
modified 8 months ago by Peter Langfelder1.8k • written 8 months ago by bekah20
Answer: WGCNA: Understanding module-trait correlations
Peter Langfelder1.8k wrote:

Your question is not very clear but I'll try anyway. The module-trait heatmap usually shows the correlations of the module eigengenes with the traits. When that correlation is high, the eigengene increases with increasing trait. In a signed network (where all genes in a module are positively correlated with the eigengene), a high eigengene-trait correlation means that pretty much all genes in the module should follow the same pattern of increasing expression with increasing trait values. In an unsigned network, a module can also contain genes that are strongly negatively correlated with the eigengene, so some genes may show the opposite behaviour.
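To make this concrete, here is a minimal base-R sketch on simulated data (the toy data, the variable names, and the choice to align the eigengene's sign with mean expression are all assumptions of this example, not part of the original analysis). It computes an eigengene as the first principal component of one module and correlates it with a 0/1 treatment trait:

```r
# Toy data: 20 samples (10 untreated = 0, 10 treated = 1), 5 genes in one module.
set.seed(1)
trait  <- rep(c(0, 1), each = 10)
signal <- trait + rnorm(20, sd = 0.3)            # shared module signal, higher when treated
expr   <- sapply(1:5, function(i) signal + rnorm(20, sd = 0.3))
colnames(expr) <- paste0("gene", 1:5)

# Eigengene = first principal component of the scaled expression matrix
# (samples in rows, genes in columns).
scaled    <- scale(expr)
eigengene <- svd(scaled)$u[, 1]

# The sign of a principal component is arbitrary; this sketch aligns it with
# the average scaled expression so "high eigengene" means "high expression".
if (cor(eigengene, rowMeans(scaled)) < 0) eigengene <- -eigengene

me_trait_cor   <- cor(eigengene, trait)          # the value shown in the heatmap cell
gene_trait_cor <- cor(expr, trait)[, 1]          # per-gene correlations with the trait

print(round(me_trait_cor, 2))
print(round(gene_trait_cor, 2))
```

With a 0/1 trait, a positive correlation simply means the eigengene (and, in a signed network, most of the module's genes) is higher on average in the samples coded 1 than in those coded 0.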

Hope this helps.

written 8 months ago by Peter Langfelder1.8k

Hi,

Cheers for the very rapid response! Yes, it is a signed network. I was trying to work out whether the correlation relates to the fold changes between treated and untreated, i.e. whether a positive correlation in the module-trait heatmap means all genes in that module are upregulated (higher expression) when treatment is applied vs no treatment. And is it better then to use an unsigned network to see whether there are links between genes that have higher and lower expression when treatment is applied?

Best wishes,

Rebekah

written 8 months ago by bekah20

Sorry, wrong reply on the wrong message. I meant to say that I think I misunderstood. So the correlation is to do with the strength of the correlations between the nodes within a network? So a negative correlation means that, with increasing trait value, the correlation between expression levels within the network weakens?

modified 8 months ago • written 8 months ago by bekah20

There are two different correlations (or sets of correlations) that you need to distinguish. The eigengene-trait correlation measures the strength and direction of association between the module (more precisely, its representative profile, the eigengene) and the trait. If this is positive (negative), it means the trait increases (decreases) with increasing eigengene "expression".

If this correlation is strong and the network is signed (or signed hybrid), it means that most of the genes in the module will also exhibit a correlation with the trait of the same sign as the eigengene-trait correlation. In an unsigned network, the gene-trait correlations can have the same or opposite sign.

Correlations among genes in a module are usually independent of whether the eigengene is correlated with a trait or not (and whether the correlation is positive or negative).
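As a rough illustration of that last point, the following base-R sketch simulates two modules (all names and the simulated drivers are hypothetical). Both are tightly co-expressed internally, but only module A follows the trait:

```r
set.seed(2)
trait    <- rep(c(0, 1), each = 10)              # untreated = 0, treated = 1
driver_a <- ifelse(trait == 1, 1, -1)            # module A follows the trait
driver_b <- rep(c(-1, 1), times = 10)            # module B follows something unrelated

# Each gene in a module is its driver plus independent noise.
make_module <- function(driver, n_genes = 5) {
  sapply(seq_len(n_genes), function(i) driver + rnorm(length(driver), sd = 0.3))
}
mod_a <- make_module(driver_a)
mod_b <- make_module(driver_b)

# Eigengene as first principal component, sign aligned with mean expression
# (an assumption of this sketch).
eigengene <- function(x) {
  xs  <- scale(x)
  pc1 <- svd(xs)$u[, 1]
  if (cor(pc1, rowMeans(xs)) < 0) pc1 <- -pc1
  pc1
}

# Average gene-gene correlation within a module.
mean_gene_cor <- function(x) {
  r <- cor(x)
  mean(r[upper.tri(r)])
}

intramodular <- c(A = mean_gene_cor(mod_a), B = mean_gene_cor(mod_b))
me_trait     <- c(A = cor(eigengene(mod_a), trait), B = cor(eigengene(mod_b), trait))

print(round(rbind(intramodular, me_trait), 2))
```

The gene-gene correlations define the module; the eigengene-trait correlation is a separate quantity computed afterwards, and a module can score high on the first while being near zero on the second.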

HTH,

Peter

written 8 months ago by Peter Langfelder1.8k

Apologies for the late reply. We recommend using a signed or signed hybrid network analysis. You can read some more here:

http://www.peterlangfelder.com/signed-or-unsigned-which-network-type-is-preferable/ and

http://www.peterlangfelder.com/two-types-of-signed-networks-in-wgcna/

Peter

written 8 months ago by Peter Langfelder1.8k

Sorry for the delay, I've been trying to get my head around the unsigned, signed and hybrid analysis.
So:
- unsigned - negative and positive correlations between expression values of the genes are all treated equally, whether negative or positive?
- signed - negative correlations between expression values of the genes are still assigned an adjacency, but it is so small it's negligible, and therefore only positive correlations are really accounted for in the network output?
- hybrid signed - only positive correlations between expression values of the genes are accounted for, and all negative correlations are set to zero?

So in expression data where you are only interested in when expression of one gene increases with the expression level of another, you would use a signed network.
Then if a trait, e.g. length, is negatively correlated with a module, an increase in the length would be associated with lower expression of all the genes within that module?

In data where negative correlations in expression between genes are of interest (e.g. homeostatic regulation?), you should use unsigned.
Then if a trait, e.g. length, is negatively correlated with a module, an increase in the length could be associated with higher or lower expression of genes, as the sign of the correlation of a gene with the other genes in the module could be positive or negative?

 

written 8 months ago by bekah20

You got it exactly right. I would only add that instead of running an unsigned analysis, I personally prefer to run a signed or signed hybrid, and then look at whether anti-correlated modules could be thought of as part of a single biological pathway/process. In my work, negatively correlated modules are usually biologically quite different.
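For reference, the three adjacency transforms summarised above can be written out in a few lines of base R. This is a sketch of the usual WGCNA definitions as I understand them; the power of 6 and the example correlations are arbitrary illustration values:

```r
r     <- c(-0.9, -0.5, 0, 0.5, 0.9)   # example gene-gene correlations
power <- 6                            # illustrative soft-thresholding power

unsigned      <- abs(r)^power          # sign discarded: -0.9 and 0.9 treated alike
signed        <- ((1 + r) / 2)^power   # negative correlations get a tiny, non-zero adjacency
signed_hybrid <- pmax(r, 0)^power      # negative correlations set exactly to zero

print(signif(data.frame(r, unsigned, signed, signed_hybrid), 3))
```

Note that the signed transform compresses correlations into the range 0.5 to 1 before raising them to the power, which is why signed networks are typically run with a higher power than unsigned ones.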

written 8 months ago by Peter Langfelder1.8k

cheers! :) took me a while. but glad I got there! 

written 8 months ago by bekah20

Sorry another quick question - in the tutorial document, https://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/FemaleLiver-02-networkConstr-man.pdf

the only place I can see signed specified is on the axis label of the soft threshold graph? Does this mean that the default is signed, or should it be specified within the script, i.e.:

sft = pickSoftThreshold(datExpr, powerVector = powers, verbose = 5, networkType = "signed")

and then when calculating adjacency and TOM

adjacency = adjacency(datExpr, power = softPower, type="signed");

TOM = TOMsimilarity(adjacency, TOMType = "signed");

And where you have to specify the correlation, should this be done under TOM?

I have currently only specified Spearman here, as my variables are binary categorical:
moduleTraitCor = cor(MEs, datTraits, method = "spearman", use = "p");

Should I also have specified Spearman under TOM too?

Or should I have used corType = "bicor" here when running blockwiseModules, because Spearman is not an option?
net = blockwiseModules(datExpr, power = 16, TOMType ="signed", type="signed",minModuleSize = 30,maxBlockSize=30000, corType="bicor", maxPOutliers = 0.1, reassignThreshold = 0, mergeCutHeight = 0.25,numericLabels = TRUE, pamRespectsDendro = FALSE,verbose = 3)

modified 8 months ago • written 8 months ago by bekah20

The default in WGCNA is, for historical reasons, an unsigned network (at least for most functions). You need to check the help file for each function you want to use; most functions where network type matters will have an argument networkType or just type (for example, adjacency() uses type, while pickSoftThreshold() and blockwiseModules() use networkType).

TOM type is not related to network type and only makes any difference for unsigned networks. For more details, please read this discussion: https://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/TechnicalReports/signedTOM.pdf

I would avoid using Spearman correlation unless there is a good reason to use it; in my experience, it leads to somewhat inferior results. TOM calculation directly from expression data (TOMsimilarityFromExpr) can only use Pearson or biweight midcorrelation; if you do want Spearman, you have to calculate the adjacency first (using Spearman correlation) and then turn it into TOM using TOMsimilarity().
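If Spearman really is needed, the two-step route could look roughly like the sketch below. The toy datExpr and the softPower value are placeholders, and the signed adjacency is computed by hand in base R rather than through adjacency() so that the Spearman step is explicit; the TOM step only runs if WGCNA is installed:

```r
set.seed(3)
# Toy expression matrix: 30 samples (rows) x 8 genes (columns).
datExpr <- matrix(rnorm(30 * 8), nrow = 30,
                  dimnames = list(NULL, paste0("gene", 1:8)))
softPower <- 12   # placeholder; pick this from your own scale-free fit

# Step 1: Spearman correlation, then the signed adjacency transform.
spearmanCor <- cor(datExpr, method = "spearman", use = "pairwise.complete.obs")
adj <- ((1 + spearmanCor) / 2)^softPower

# Step 2: turn the adjacency matrix into TOM.
if (requireNamespace("WGCNA", quietly = TRUE)) {
  TOM     <- WGCNA::TOMsimilarity(adj, TOMType = "signed")
  dissTOM <- 1 - TOM   # dissimilarity used for hierarchical clustering
}
```

Since the adjacency here is already signed (all values between 0 and 1), the TOMType choice should make little practical difference, in line with the comment above that it mainly matters for unsigned networks.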

Having binary traits is by itself not a reason to use Spearman correlation; you can use Pearson correlation or (with some care) the robust biweight midcorrelation.
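One way to see why Pearson is fine with a 0/1 trait: the Pearson correlation test against a binary variable is equivalent to an equal-variance two-sample t-test between the two groups. A small base-R check on simulated values (the eigengene here is just made-up numbers):

```r
set.seed(4)
trait     <- rep(c(0, 1), each = 10)       # untreated = 0, treated = 1
eigengene <- 0.8 * trait + rnorm(20)       # simulated eigengene values

pearson <- cor.test(eigengene, trait, method = "pearson")
pooled  <- t.test(eigengene[trait == 1], eigengene[trait == 0], var.equal = TRUE)

# Same t statistic and same p-value from both tests.
print(c(pearson_t = unname(pearson$statistic), pooled_t = unname(pooled$statistic)))
print(c(pearson_p = pearson$p.value, pooled_p = pooled$p.value))
```

So the module-trait p-value from a Pearson correlation with a binary trait carries the same information as a standard treated-vs-untreated comparison of the eigengene.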

 

written 8 months ago by Peter Langfelder1.8k

Thank you for your response. I have just tried bicor within pickSoftThreshold and the fit does not reach 0.85, so I will stick to Pearson. The reason I swapped to bicor was that the FAQ section of WGCNA recommends its use over Pearson due to outlier sensitivity, but Pearson is okay to use then?

If using bicor, should maxPOutliers = 0.05 and robustY = FALSE only be used when correlating the module eigengene to the binary categorical trait? Or should these be used when creating the modules too?

If a signed network has been used and some genes have a negative module membership, does this mean that they are negatively correlated with the eigengene (PC1) but are positively correlated with the genes to which their expression is correlated?

Apologies for all the questions, I'm just trying really hard to understand the theory behind what I am running in R.

Best wishes 

R

modified 8 months ago • written 8 months ago by bekah20