Question: WGCNA - problems picking a suitable power
3.3 years ago by
University of Illinois, Urbana-Champaign
mrodrigues.fernanda10 wrote:

Hi

I am running WGCNA for module detection in my RNA-Seq data and I am having trouble picking a power for my data.

I have 24 samples and I am using a signed network. When I run the pickSoftThreshold function, I see that I would need a power of 30 or 32 for my data (these are the first values at which the scale-free topology fit index reaches 0.9). Is this power too high? Should I use a lower power for module detection?

In the WGCNA FAQ page, I saw that the authors recommend using a power of 18 for signed networks with a sample size between 20 and 30 in case the scale-free topology fit index fails to reach values above 0.9 for reasonable powers (less than 15 for unsigned or signed hybrid networks, and less than 30 for signed networks). Is this the case for my data, or should I be fine using a power of 30 or 32?

I tried running it with a power of 18 and a power of 30 (using deepSplit = 2 and minModuleSize = 20), but in both cases I have over 4,000 genes in module 0. Is that normal, or is there something wrong with my data?

Any help is appreciated. Thank you!

Below are my code and the output for power detection:

powers = c(c(1:10), seq(from = 12, to=40, by=2))

sft1 <- pickSoftThreshold(datExpr0.2FDR, powerVector = powers, networkType ="signed")

Power SFT.R.sq slope truncated.R.sq mean.k. median.k. max.k.
1      1  0.00107  2.32          0.914 4750.00  4750.000 4900.0
2      2  0.18700 -9.38          0.860 2590.00  2570.000 2960.0
3      3  0.41000 -6.90          0.882 1510.00  1470.000 2000.0
4      4  0.55600 -5.08          0.902  923.00   885.000 1430.0
5      5  0.64600 -4.09          0.911  592.00   553.000 1070.0
6      6  0.69300 -3.49          0.910  393.00   358.000  831.0
7      7  0.73200 -3.07          0.912  270.00   237.000  658.0
8      8  0.78600 -2.72          0.932  190.00   161.000  532.0
9      9  0.82200 -2.51          0.942  138.00   112.000  436.0
10    10  0.85500 -2.35          0.955  101.00    79.100  363.0
11    12  0.88700 -2.16          0.965   58.30    41.300  261.0
12    14  0.87900 -2.16          0.960   35.60    22.800  203.0
13    16  0.87700 -2.13          0.960   22.80    13.100  162.0
14    18  0.88500 -2.09          0.969   15.20     7.780  133.0
15    20  0.88600 -2.05          0.970   10.50     4.770  111.0
16    22  0.88700 -2.01          0.973    7.53     3.000   93.8
17    24  0.86900 -2.00          0.962    5.51     1.940   80.5
18    26  0.88000 -1.94          0.968    4.14     1.280   69.8
19    28  0.89000 -1.88          0.970    3.16     0.867   61.2
20    30  0.89700 -1.84          0.970    2.47     0.595   54.1
21    32  0.90500 -1.80          0.974    1.95     0.414   48.1
22    34  0.91100 -1.75          0.975    1.57     0.293   43.1
23    36  0.93100 -1.70          0.982    1.28     0.210   38.8
24    38  0.94000 -1.65          0.983    1.06     0.152   35.2
25    40  0.94900 -1.61          0.988    0.88     0.111   32.0
modified 3.3 years ago by Peter Langfelder1.9k • written 3.3 years ago by mrodrigues.fernanda10
Answer: WGCNA - problems picking a suitable power
3.3 years ago by
chris86380
UCL, United Kingdom
chris86380 wrote:

4000 genes seems a bit on the high side for one module. Have you corrected for covariates using ComBat?

If you have two groups with a major difference in the data, then I think this kind of thing can happen. Sometimes people separate the two groups, run an individual WGCNA analysis on each of them, and then compare the networks.

Answer: WGCNA - problems picking a suitable power
3.3 years ago by
United States
Peter Langfelder1.9k wrote:

4000 genes in "module" 0 is nothing unusual - I put the module in quotation marks because the label 0 is reserved for genes not assigned to any module.
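As a minimal sketch of what the label 0 means in practice, the snippet below counts unassigned genes and tabulates only the real modules. The label vector is a made-up toy example (the name moduleLabels is hypothetical), standing in for the numeric module labels a WGCNA run would return.

```r
# Hypothetical numeric module labels for 10 genes.
# By WGCNA convention, label 0 marks genes not assigned to any module.
moduleLabels <- c(0, 1, 1, 2, 0, 1, 2, 0, 3, 3)

# Sizes of all labels, including the unassigned "module" 0.
moduleSizes <- table(moduleLabels)

# Number of genes that were not assigned to any module.
nUnassigned <- sum(moduleLabels == 0)

# Keep only genes that belong to a real module before downstream analysis.
assigned <- moduleLabels != 0
realModuleSizes <- table(moduleLabels[assigned])

print(moduleSizes)
print(realModuleSizes)
```

With real data, the same two table() calls make it easy to report module sizes without counting the unassigned genes as a module.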

I don't think there's anything wrong with your data. You can go with a power of 18 (the scale-free topology fit R^2 is 0.88, which is more than enough).
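To see the trade-off in numbers, here is a small sketch that uses a few rows copied from the pickSoftThreshold output in the question. The data frame name fit and the derived columns are illustrative assumptions; with a real run, the same columns are available in the fitIndices element of the pickSoftThreshold result.

```r
# A few rows copied from the pickSoftThreshold output in the question.
fit <- data.frame(
  Power    = c(1, 10, 12, 18, 30, 32),
  SFT.R.sq = c(0.00107, 0.855, 0.887, 0.885, 0.897, 0.905),
  slope    = c(2.32, -2.35, -2.16, -2.09, -1.84, -1.80),
  mean.k.  = c(4750, 101, 58.3, 15.2, 2.47, 1.95)
)

# Signed fit index as plotted in the WGCNA tutorials: -sign(slope) * R^2.
# It is negative when the slope is positive (power 1 here).
fit$signedR2 <- -sign(fit$slope) * fit$SFT.R.sq

# Gain in fit index when going from power 18 to power 30.
gainR2 <- fit$signedR2[fit$Power == 30] - fit$signedR2[fit$Power == 18]

# Factor by which mean connectivity drops over the same step.
connRatio <- fit$mean.k.[fit$Power == 18] / fit$mean.k.[fit$Power == 30]

print(fit[, c("Power", "signedR2", "mean.k.")])
print(gainR2)
print(connRatio)
```

In this table, moving from power 18 to power 30 raises the fit index by only about 0.01, while the mean connectivity falls roughly sixfold.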

If you haven't done so, please also follow the advice in the WGCNA FAQ on working with RNA-seq data, especially the recommendation to apply a variance-stabilizing transformation or at least a log transformation.
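A minimal sketch of the simpler option, the log transformation, in base R. The count matrix here is simulated toy data, and the names counts, logExpr and datExpr are assumptions for illustration. For proper variance stabilization, one option is varianceStabilizingTransformation from the DESeq2 package, which is not shown here.

```r
# Hypothetical raw count matrix: genes in rows, samples in columns.
set.seed(1)
counts <- matrix(
  rpois(5 * 4, lambda = 50),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("sample", 1:4))
)

# Log-transform with a pseudocount of 1 so that zero counts stay finite.
logExpr <- log2(counts + 1)

# WGCNA functions expect samples in rows and genes in columns,
# so transpose before passing the data on.
datExpr <- t(logExpr)

print(dim(datExpr))
```

The transposed matrix is what would then be passed to functions such as pickSoftThreshold.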