Question: WGCNA sample size minimum: why?
4 weeks ago by
charles.foster30 wrote:

Hi all,

While looking into WGCNA analysis, I saw that the minimum recommended sample size is 15 samples because:

"correlations on fewer than 15 samples will simply be too noisy for the network to be biologically meaningful"
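
To put a rough number on "too noisy", here is a minimal sketch (not taken from the WGCNA documentation) that uses the standard Fisher z-transform approximation to compute a 95% confidence interval for a single Pearson correlation at several sample sizes. The function name `pearson_ci`, the example value r = 0.5, and the list of sample sizes are illustrative assumptions.

```python
import math

def pearson_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transform: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3). z_crit = 1.96 gives about 95% coverage.
    """
    if n <= 3:
        raise ValueError("the approximation needs more than 3 samples")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Illustrative values only: an observed correlation of 0.5 at various n.
for n in (6, 10, 12, 15, 20, 50):
    lo, hi = pearson_ci(0.5, n)
    print(f"n={n:3d}  r=0.50  95% CI = [{lo:+.2f}, {hi:+.2f}]  width = {hi - lo:.2f}")
```

The interval narrows only slowly as n grows, and a network is built from many such correlations at once, so the uncertainty in each pairwise estimate matters.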

Could anyone here clarify why this is the case? In trying to understand it, I've thought of two possibilities that partially overlap:

(1) Having <15 samples invalidates conclusions because results might be spuriously driven by one or a couple of replicates

(2) Having ≥15 samples is suggested because a smaller number might not have enough power to detect any biological trends, i.e. module eigengenes won't have any underlying biological significance

Is either, or are both, of these explanations correct?
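
Possibility (1) can be illustrated with a small, made-up example. The sketch below computes the Pearson correlation between two hypothetical genes with and without one extreme sample, then repeats the calculation leaving each sample out in turn. All expression values are invented for illustration, and `leave_one_out` is a hypothetical helper, not a WGCNA function.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def leave_one_out(x, y):
    """Correlation recomputed with each sample removed in turn."""
    return [pearson(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(len(x))]

# Invented log-expression values for two genes in 11 samples.
gene_a = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.9, 5.0, 5.2, 4.8]
gene_b = [7.0, 7.2, 6.9, 7.1, 7.0, 6.8, 7.2, 6.9, 7.1, 7.0, 7.1]

# A 12th sample that is extreme for both genes (e.g. a batch artifact).
gene_a_out = gene_a + [9.0]
gene_b_out = gene_b + [11.0]

r_clean = pearson(gene_a, gene_b)
r_with_outlier = pearson(gene_a_out, gene_b_out)
print(f"11 samples:            r = {r_clean:+.2f}")
print(f"12 samples (1 outlier): r = {r_with_outlier:+.2f}")

# A wide spread in leave-one-out correlations flags a single influential sample.
loo = leave_one_out(gene_a_out, gene_b_out)
print(f"leave-one-out range:   [{min(loo):+.2f}, {max(loo):+.2f}]")
```

With so few samples, one replicate can dominate the correlation; with more samples, the same outlier carries proportionally less weight.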

Out of interest, I ran a WGCNA analysis on a data set of 12 samples. I first recovered module eigengenes and then correlated these with three binary traits. The results look entirely reasonable: for each trait, the most highly correlated module tells an interesting biological story that agrees with a standard differential expression + GO enrichment analysis. I should note that the data are heterogeneous, with differences in expression between samples wholly reflecting the traits of interest (as per point 5 here: https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/faq.html). I'm guessing that this is why the results appear sensible: the data are highly informative, and noise does not swamp the biological signals we are interested in. In this case, I'd be inclined to say that the 15-sample rule of thumb might not matter?
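
As a sanity check on eigengene-trait correlations at this sample size, one option is an exact permutation test: with 12 samples and a balanced binary trait there are only C(12, 6) = 924 distinct labelings, so all of them can be enumerated. The sketch below assumes a hypothetical module eigengene and a 6-vs-6 trait; the numbers are invented and `exact_permutation_p` is an illustrative helper, not part of WGCNA.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def exact_permutation_p(eigengene, trait):
    """Two-sided exact p-value for |cor(eigengene, trait)|.

    Enumerates every relabeling that keeps the same number of 1s in the
    binary trait, which is feasible only for small sample sizes.
    """
    n, k = len(trait), sum(trait)
    if not 0 < k < n:
        raise ValueError("trait must contain both 0s and 1s")
    observed = abs(pearson(eigengene, trait))
    hits = total = 0
    for ones in combinations(range(n), k):
        chosen = set(ones)
        relabeled = [1 if i in chosen else 0 for i in range(n)]
        total += 1
        # Small tolerance so that floating-point ties count as "as extreme".
        if abs(pearson(eigengene, relabeled)) >= observed - 1e-12:
            hits += 1
    return hits / total

# Invented eigengene values for 12 samples; the trait splits them 6 vs 6.
eigengene = [-0.35, -0.30, -0.28, -0.25, -0.20, -0.31,
             0.27, 0.33, 0.22, 0.30, 0.29, 0.28]
trait = [0] * 6 + [1] * 6

p = exact_permutation_p(eigengene, trait)
print(f"r = {pearson(eigengene, trait):+.3f}, exact permutation p = {p:.5f}")
```

Even a perfectly separated eigengene cannot reach a two-sided p-value below 2/924 in this design, which gives a feel for how much resolution 12 samples offer before any multiple-testing correction across modules and traits.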

Any comments would be appreciated :)

– C

Answer: WGCNA sample size minimum: why?
4 weeks ago by
United States
Peter Langfelder2.1k wrote:

If you have strong signal and clean data, yes, WGCNA could be informative even with 12 samples. The worst that can happen (with few samples, strong signal and fairly clean data) is that WGCNA won't give you insights that you could not gain from a plain DE analysis. Many of the finer-grained results of WGCNA (e.g., picking hub genes) become less reliable with fewer samples, and 15 seems like a good place to draw a generic line. Sometimes you can get decent results with 10 samples, and sometimes a 20-sample (or bigger) WGCNA won't provide any good insights.
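
The point about hub genes can be illustrated with a toy simulation. This is a sketch under strong assumptions, not WGCNA itself: 20 genes in one module follow a single shared latent factor with loadings from 0.9 down to 0.3, connectivity is the sum of |cor|^6 over the other genes (the power 6 is only an illustrative soft threshold), and the "true" hubs are the three genes with the largest loadings. All names, the seed, and the parameter values are hypothetical.

```python
import math
import random

def simulate_expression(loadings, n_samples, rng):
    """Each gene = loading * shared factor + independent noise (unit variance)."""
    latent = [rng.gauss(0, 1) for _ in range(n_samples)]
    return [
        [l * z + math.sqrt(1 - l * l) * rng.gauss(0, 1) for z in latent]
        for l in loadings
    ]

def standardize(v):
    """Center and scale to unit length, so a dot product equals the correlation."""
    m = sum(v) / len(v)
    centered = [a - m for a in v]
    norm = math.sqrt(sum(a * a for a in centered))
    return [a / norm for a in centered]

def connectivity(genes, power=6):
    """Per-gene sum of |cor|^power with every other gene."""
    z = [standardize(g) for g in genes]
    k = [0.0] * len(z)
    for i in range(len(z)):
        for j in range(i + 1, len(z)):
            adj = abs(sum(a * b for a, b in zip(z[i], z[j]))) ** power
            k[i] += adj
            k[j] += adj
    return k

def hub_recovery(n_samples, n_reps=200, top=3, seed=1):
    """Mean fraction of the true top hubs found among the estimated top hubs."""
    rng = random.Random(seed)
    n_genes = 20
    loadings = [0.9 - 0.6 * i / (n_genes - 1) for i in range(n_genes)]
    true_hubs = set(range(top))  # genes with the largest loadings
    overlap = 0
    for _ in range(n_reps):
        k = connectivity(simulate_expression(loadings, n_samples, rng))
        picked = sorted(range(n_genes), key=lambda i: k[i], reverse=True)[:top]
        overlap += len(true_hubs & set(picked))
    return overlap / (top * n_reps)

for n in (8, 12, 15, 30, 60):
    print(f"n={n:2d}  fraction of true top-3 hubs recovered: {hub_recovery(n):.2f}")
```

In this toy setting the module as a whole is easy to detect at any of these sample sizes, but the ranking of genes within it depends on small differences between correlations, which is the kind of finer-grained result that suffers first when samples are few.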


Great, thanks for the fast clarification!

written 4 weeks ago by charles.foster30