Question

WGCNA normalized CDF to use, WGCNA Intramodular connectivity

akridgerunner ▴ 30

@akridgerunner-7719

Last seen 7.9 years ago

United States

Dear WGCNA Gurus,

What type of affy normalization do you recommend I use for WGCNA input? I found a piece in the literature which favors MAS5 and finds a flaw in GCRMA. I'm not a big fan of MAS5 due to the inherent problems with mismatch probes, so do you think I could safely use RMA? Has anyone made a correction to the GCRMA approach? The paper is titled "Comparative analysis of microarray normalization procedures: effects on reverse engineering gene networks".

What do you think of using limma on GCRMA-normalized data to find differentially expressed probes and, as a separate approach, using WGCNA on RMA-normalized data to find probes within modules that are significantly correlated with disease activity and other clinical attributes, and then comparing the output of the two approaches?

Have any of you worked with custom Affy CDFs such as those from BrainArray? I see the probe definitions are more accurate, but they discard almost half of the probes on the array for reasons including non-specific binding. Should I press on with the onboard Affy CDF, or can I still get a good analysis through WGCNA using the BrainArray CDF? http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/genomic_curated_CDF.asp

Also, could you please explain the WGCNA intramodular connectivity statistic? I generally understand this as a measurement of a probe's "connectedness" within its module. I thought the largest value it could take was the total number of probes in that probe's module, but I've seen this isn't so. How do I express this as a value relative to the other probes in the module? I don't think I can use a simple ratio.

Best wishes,

Robert

wgcna affymetrix microarrays affy • 2.6k views

updated 8.8 years ago by Peter Langfelder ★ 3.0k • written 8.9 years ago by akridgerunner ▴ 30

score 2 · Answer 1 · 2015-06-04

Peter Langfelder ★ 3.0k

@peter-langfelder-4469

Last seen 28 days ago

United States

People have successfully used WGCNA with both MAS5- and RMA-normalized data. If you go the MAS5 route, I recommend log-transforming the data (log transformation is built into RMA). I think normalization has more of an effect on regulatory network reconstruction because most reconstruction algorithms use partial correlations, which are more sensitive to normalization than the marginal correlations employed by WGCNA.
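As a rough illustration of the log-transformation step for MAS5 output, here is a minimal base R sketch. The intensity matrix is simulated, and the object names (`mas5`, `log_expr`) and the offset of 1 are assumptions made for the example, not part of any recommended pipeline.

```r
# Hypothetical MAS5-scale intensities: rows are probes, columns are samples.
set.seed(1)
mas5 <- matrix(rlnorm(20, meanlog = 6, sdlog = 1), nrow = 5,
               dimnames = list(paste0("probe", 1:5), paste0("sample", 1:4)))

# log2 with an offset of 1 (an assumption) so that values near zero stay finite.
log_expr <- log2(mas5 + 1)

# WGCNA functions generally expect samples in rows and probes in columns.
datExpr <- t(log_expr)
dim(datExpr)
```

RMA output is already on the log2 scale, so this step would only apply to the MAS5 route.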

I am not familiar with the BrainArray CDF so I can't comment on that vs. standard Affy CDF.

Intramodular connectivity has two forms. "Standard" intramodular connectivity (kIM) is the sum of a gene's connection strengths to the other genes in its module. It rarely even approaches the number of genes in the module because most connection strengths are far below 1 due to soft thresholding. The second form is module eigengene-based connectivity (kME), which is simply the correlation of the gene's expression with the module eigengene. kME is always between -1 and 1, but for module genes it tends to be highly correlated with "standard" intramodular connectivity. You can rank genes by intramodular connectivity: genes with high kIM or kME are called hubs and tend to be important.
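To make the two definitions concrete, here is a small base R sketch for a single simulated module. It follows the verbal definitions above rather than calling WGCNA itself. The simulated data, the soft-thresholding power `beta = 6`, the unsigned adjacency, and the eigengene computed as the first principal component are all assumptions for illustration.

```r
set.seed(42)
n_samples <- 30
# Hypothetical module: 6 probes driven by one shared signal plus noise.
signal <- rnorm(n_samples)
datExpr <- sapply(1:6, function(i) signal + rnorm(n_samples, sd = 0.5 * i))
colnames(datExpr) <- paste0("probe", 1:6)

# Soft-thresholded, unsigned adjacency: |cor|^beta (beta = 6 is an assumed power).
beta <- 6
adj <- abs(cor(datExpr))^beta
diag(adj) <- 0                      # a probe is not connected to itself

# kIM: sum of connection strengths to the other probes in the module.
kIM <- rowSums(adj)

# Module eigengene: first principal component of the scaled expression data.
pc1 <- svd(scale(datExpr))$u[, 1]
# Orient the eigengene so it correlates positively with average expression.
if (cor(pc1, rowMeans(scale(datExpr))) < 0) pc1 <- -pc1

# kME: correlation of each probe with the module eigengene.
kME <- drop(cor(datExpr, pc1))

round(cbind(kIM, kME), 3)
```

Because each off-diagonal adjacency is a correlation raised to a power, every term is below 1, which is why kIM stays well under the module size while kME stays within [-1, 1].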

Hope this helps,

Peter

written 8.9 years ago by Peter Langfelder ★ 3.0k

Thanks as always Peter! So what is the theoretical maximum kIM connection strength for a given probe in a module? I'm trying to show this strength relative to probes within other modules, so am trying to scale these somehow. Cheers from Alaska, Rob.

written 8.9 years ago by akridgerunner ▴ 30

If you want to compare intramodular connectivity between different modules, use kME, which is naturally bounded by 1 in absolute value, so the same value means the same correlation with the module eigengene in every module.

The theoretical maximum of kIM is the number of module genes minus 1. But as I said, in practice this is never even approached, so we tend to either leave kIM as is or normalize it to the maximum in the module, i.e. the top hub gene in each module has by definition a scaled kIM equal to 1. This makes kIM difficult to compare between modules.
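The two scalings mentioned here can be sketched in base R as follows. The kIM values and module labels are made up for illustration, and the variable names (`kIM_scaled`, `kIM_frac_theory`) are hypothetical.

```r
# Hypothetical kIM values for probes in two modules of different size.
kIM <- c(a1 = 1.2, a2 = 0.9, a3 = 0.6, a4 = 0.3,
         b1 = 0.5, b2 = 0.4, b3 = 0.1)
module <- c(rep("blue", 4), rep("brown", 3))

# Scaling to the module maximum: the top hub in each module gets exactly 1.
kIM_scaled <- kIM / ave(kIM, module, FUN = max)

# Fraction of the theoretical maximum (module size minus 1).
module_size <- table(module)
kIM_frac_theory <- kIM / (as.numeric(module_size[module]) - 1)

data.frame(module, kIM, kIM_scaled, kIM_frac_theory)
```

Either scaling is relative to the probe's own module, so a scaled value of 1 in one module does not imply the same connection strength as a 1 in another; that is the reason kME is suggested for between-module comparisons.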

written 8.9 years ago by Peter Langfelder ★ 3.0k

Hi Peter, am I correct in assuming that your softConnectivity() function generates kIM values and signedKME() generates kME values? Which do I use to select hub genes? I read the thread "WGCNA Hub Gene Selection Method"; do you still favor kME?

I'm using:

IMConn = softConnectivity(datExpr);

KME <- signedKME( datExpr, MEs, outputColumnName="KME", corFnc="bicor")

Anything else to know? Am I doing anything erroneously?

Best wishes,

Robert

written 8.8 years ago by akridgerunner ▴ 30

score 0 · Answer 2 · 2015-07-19

Peter Langfelder ★ 3.0k

See this thread:

A: WGCNA Hub Gene Selection Method

written 8.8 years ago by Peter Langfelder ★ 3.0k

Great. How is softConnectivity() different from intramodularConnectivity()? Does softConnectivity() calculate adjacency to all other nodes, while intramodularConnectivity() is of course restricted to a probe's "home" module?

written 8.8 years ago by akridgerunner ▴ 30