Dear WGCNA Gurus,
What type of Affymetrix normalization do you recommend for WGCNA input? I found a paper that favors MAS5 and reports a flaw in GCRMA; it is titled "Comparative analysis of microarray normalization procedures: effects on reverse engineering gene networks". I'm not a big fan of MAS5 because of the inherent problems with mismatch probes, so do you think I could safely use RMA? Has anyone made a correction to the GCRMA approach?
What do you think of running Limma on GCRMA-normalized data to find differentially expressed probes and, as a separate approach, running WGCNA on RMA-normalized data to find probes within modules that are significantly correlated with disease activity and other clinical attributes, and then comparing the output of the two approaches?
Have any of you worked with custom Affy CDFs such as those from BrainArray? I see the probe definitions are more accurate, but they discard almost half of the probes on the array for reasons including non-specific binding. Should I press on with the standard Affy CDF, or can I still get a good analysis through WGCNA using the BrainArray CDF? http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/genomic_curated_CDF.asp
Also, could you please explain the WGCNA intramodular connectivity statistic? I generally understand it as a measure of a probe's "connectedness" within its module. I thought the largest value it could take was the total number of probes in that probe's module, but I've seen this isn't so. How do I express it as a value relative to the other probes in the module? I don't think I can use a simple ratio.
Best wishes,
Robert
Thanks as always, Peter! So what is the theoretical maximum kIM connection strength for a given probe in a module? I want to show this strength relative to probes in other modules, so I'm trying to scale these values somehow. Cheers from Alaska, Rob.
If you want to compare intramodular connectivity between different modules, use kME. It is naturally normalized (it is a correlation, so its absolute value is at most 1), and the same value means the same correlation with the module eigengene regardless of the module.
The theoretical maximum of the connectivity is the number of module genes minus 1. But as I said, in practice this is never even approached, and we tend to either leave kIM as is or normalize it to the maximum in the module, i.e. the top hub gene in each module has by definition a scaled kIM equal to 1. Either way, kIM is difficult to compare between modules.
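To make the difference concrete, here is a minimal base-R sketch on simulated data. It is an illustration under stated assumptions, not the WGCNA implementation: the variable names, the soft-thresholding power of 6, the unsigned adjacency, and the use of the first principal component as the module eigengene are all choices made for this example.

```r
# Simulated data for one module; all names and parameters are hypothetical.
set.seed(1)
n_samples <- 40
n_genes   <- 12
beta      <- 6   # soft-thresholding power (assumed for this sketch)

# Shared signal plus gene-specific noise of increasing size.
signal   <- rnorm(n_samples)
noise_sd <- seq(0.3, 2, length.out = n_genes)
module_expr <- sapply(noise_sd, function(s) signal + rnorm(n_samples, sd = s))
colnames(module_expr) <- paste0("gene", seq_len(n_genes))

# Unsigned adjacency a_ij = |cor(x_i, x_j)|^beta, self-connections removed.
adj <- abs(cor(module_expr))^beta
diag(adj) <- 0

# kIM: sum of a gene's adjacencies to the other genes in the module.
kIM <- rowSums(adj)
kIM_max_theoretical <- n_genes - 1   # only reached if every |cor| were exactly 1
kIM_scaled <- kIM / max(kIM)         # the top hub gets 1 by construction

# Module eigengene approximated by the first principal component,
# with its sign aligned to the average standardized expression.
z  <- scale(module_expr)
ME <- prcomp(z)$x[, 1]
if (cor(ME, rowMeans(z)) < 0) ME <- -ME

# kME: correlation of each gene with the module eigengene.
kME <- drop(cor(module_expr, ME))

print(round(cbind(kIM, kIM_scaled, kME), 3))
```

The raw kIM values stay well below the theoretical maximum of 11, and the scaled values depend on the strongest gene in this particular module, whereas kME is a correlation and means the same thing in any module.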
Hi Peter, am I correct in assuming that your softConnectivity() function generates kIM values and signedKME() generates kME values? Which do I use to select hub genes? I read "WGCNA Hub Gene Selection Method"; do you still favor kME?
I'm using:
IMConn = softConnectivity(datExpr);
KME <- signedKME( datExpr, MEs, outputColumnName="KME", corFnc="bicor")
Is there anything else I should know? Am I doing anything wrong?
Best wishes,
Robert
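One point worth checking in the code above: a connectivity computed over all probes in datExpr is a whole-network connectivity, while kIM only counts connections to probes in the same module. The base-R sketch below shows the distinction on simulated data. It does not call WGCNA; the module labels, the power of 6, the unsigned adjacency, and the helper functions make_module() and eigengene() are assumptions made for illustration. As far as I know, the WGCNA function intramodularConnectivity() reports the corresponding quantities as kTotal, kWithin, and kOut.

```r
# Two simulated modules; all names and parameters are hypothetical.
set.seed(2)
n_samples <- 50
beta      <- 6   # soft-thresholding power (assumed)

make_module <- function(n_genes, prefix) {
  signal <- rnorm(n_samples)
  m <- sapply(seq_len(n_genes),
              function(i) signal + rnorm(n_samples, sd = runif(1, 0.3, 1.5)))
  colnames(m) <- paste0(prefix, seq_len(n_genes))
  m
}
datExpr <- cbind(make_module(10, "a"), make_module(15, "b"))
module  <- rep(c("blue", "brown"), times = c(10, 15))

# Unsigned adjacency over all probes, self-connections removed.
adj <- abs(cor(datExpr))^beta
diag(adj) <- 0

# Whole-network connectivity versus connectivity within the probe's own module.
kTotal  <- rowSums(adj)
same    <- outer(module, module, "==")
kWithin <- rowSums(adj * same)
kOut    <- kTotal - kWithin

# Module eigengene approximated by the first principal component,
# sign-aligned with the average standardized expression.
eigengene <- function(x) {
  z  <- scale(x)
  me <- prcomp(z)$x[, 1]
  if (cor(me, rowMeans(z)) < 0) me <- -me
  me
}
MEs <- sapply(split(seq_along(module), module),
              function(idx) eigengene(datExpr[, idx, drop = FALSE]))

# kME of each probe with the eigengene of its own module.
kME_own <- sapply(seq_along(module),
                  function(i) cor(datExpr[, i], MEs[, module[i]]))
names(kME_own) <- colnames(datExpr)

# Top hub candidate per module under each criterion.
hubs_by_kME     <- sapply(split(kME_own, module), function(v) names(which.max(v)))
hubs_by_kWithin <- sapply(split(kWithin, module), function(v) names(which.max(v)))
print(rbind(hubs_by_kME, hubs_by_kWithin))
```

In this sketch the two criteria are both restricted to each probe's own module, so their rankings can be compared directly; ranking by kTotal instead would also reward connections to probes outside the module.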