Question: WGCNA small number of genes
zson3366 wrote (22 months ago):

I am a WGCNA user and have found it really useful for my RNA-seq analysis! Because of my current research interest, the data I am planning to work on is qPCR data, with a much smaller number of genes (around 30) and a relatively large sample size (1000 samples, highly heterogeneous). I could not get a scale-free topology or identify any big modules, which I think is because of the small number of genes. I searched a lot but could not find anyone asking about using WGCNA on small gene sets (yes, most datasets are now big data...).

Do you think WGCNA is still applicable to such a small dataset? I am planning to use power = 10 and detect modules of size 5 genes. Do you have any suggestions on parameter adjustment or on how to use the package in this situation?

I really appreciate your time and any suggestions!!!

Lluís Revilla Sancho (European Union) wrote, 21 months ago:

I don't think a network with scale-free topology can be found with 30 genes. There simply aren't enough features to show the characteristics of such networks. You could use the hard-thresholding method instead (it is not explained in the tutorials, but it is well documented).
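
To make the difference between the two thresholding styles concrete, here is a minimal sketch in plain Python. It is not WGCNA code: the function names, the toy correlation matrix, and the cutoff of 0.6 are assumptions chosen for illustration. It assumes an unsigned network, where soft thresholding raises the absolute correlation to a power and hard thresholding keeps an edge only if the absolute correlation reaches a cutoff. Note that with power = 10 a correlation of 0.5 shrinks to about 0.001.

```python
def soft_adjacency(cor, power):
    """Unsigned soft thresholding: |r| ** power (diagonal set to 0 here)."""
    n = len(cor)
    return [[0.0 if i == j else abs(cor[i][j]) ** power for j in range(n)]
            for i in range(n)]


def hard_adjacency(cor, cutoff):
    """Hard thresholding: edge (1) if |r| >= cutoff, otherwise no edge (0)."""
    n = len(cor)
    return [[0 if i == j else int(abs(cor[i][j]) >= cutoff) for j in range(n)]
            for i in range(n)]


def connectivity(adj):
    """Per-gene connectivity: sum of the gene's edge weights to other genes."""
    return [sum(row) for row in adj]


# Hypothetical correlation matrix for three genes (illustration only).
cor = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.7],
    [0.5, 0.7, 1.0],
]

soft = soft_adjacency(cor, power=10)
hard = hard_adjacency(cor, cutoff=0.6)
print("soft connectivity:", connectivity(soft))
print("hard connectivity:", connectivity(hard))
```

With so few genes, a hard cutoff at least gives an edge list that can be inspected directly, whereas a high soft power leaves almost every edge weight close to zero.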

However, be aware that you should correct for the highly heterogeneous samples: the more similar the samples are, the better WGCNA will reflect the biology behind them. You should consider splitting the dataset by group, hospital, method, etc. to get more homogeneous groups, and then use the consensus method. Note that using the multiData structure requires some changes to how the networks are built.
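
The idea behind a consensus network can be sketched as follows, again in plain Python rather than with the WGCNA API. This is a deliberately simplified illustration under stated assumptions: the two per-group correlation matrices are made up, the helper names are hypothetical, and the consensus is taken as the element-wise minimum of the per-group adjacencies. WGCNA's own consensus functions work on topological overlap and include further steps, so check the package documentation for the real procedure.

```python
def group_adjacency(cor, power):
    """Unsigned soft-threshold adjacency for one group of samples."""
    return [[abs(r) ** power for r in row] for row in cor]


def consensus_min(adjacencies):
    """Element-wise minimum across groups: an edge is only as strong
    as it is in the group where it is weakest."""
    n = len(adjacencies[0])
    return [[min(a[i][j] for a in adjacencies) for j in range(n)]
            for i in range(n)]


# Hypothetical correlations among the same three genes in two sample groups.
cor_group_a = [
    [1.0, 0.90, 0.2],
    [0.90, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]
cor_group_b = [
    [1.0, 0.85, 0.1],
    [0.85, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

power = 6  # illustrative value, not a recommendation
consensus = consensus_min([
    group_adjacency(cor_group_a, power),
    group_adjacency(cor_group_b, power),
])

# Genes 1 and 2 are strongly connected in both groups, so the edge survives;
# genes 2 and 3 are connected only in group A, so the consensus edge vanishes.
print(consensus[0][1], consensus[1][2])
```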

Peter Langfelder (United States) wrote, 21 months ago:

Indeed, with a small number of genes you are not likely to construct a scale-free network, but the heterogeneity probably plays a large part as well. I suggest you read through the WGCNA FAQ, especially points 2, 5, and 6 (but other parts of the FAQ can be helpful as well). Use a soft thresholding power from the table in point 6. Doing a consensus analysis is one way to deal with a heterogeneous data set, but there are others as well (point 5). 

On a more general note, for 30 genes I would first focus more on suitable visualization than on trying to find modules. A simple heatmap with clustered genes, and with samples organized either by clustering or by external information (the groups that make the samples so heterogeneous), may already tell you a lot and be easier to interpret than WGCNA modules and their eigengenes. Only if that does not provide all the needed information would I try WGCNA.
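
As a rough illustration of the "clustered genes" part of such a heatmap, the sketch below orders genes by average-linkage hierarchical clustering on the distance 1 - r, using only the Python standard library. The gene names, the toy expression values, and the function names are all hypothetical, and the choice of distance and linkage is an assumption; in practice a plotting package's built-in clustering would normally be used, and the same ordering step can be applied to samples.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster_order(profiles):
    """Return gene names in an order that keeps similar genes together,
    using average-linkage agglomerative clustering on 1 - r."""
    names = list(profiles)
    dist = {(a, b): 1 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}

    def avg_dist(c1, c2):
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[g] for g in names]
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


# Hypothetical expression values: 4 genes measured in 5 samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneD": [5.2, 3.9, 3.1, 2.2, 0.8],
}

order = cluster_order(expr)
for gene in order:  # rows of the heatmap, in clustered order
    print(gene, expr[gene])
```

In this toy data, geneA and geneB rise together while geneC and geneD fall together, so each pair ends up in adjacent rows.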
