WGCNA small number of genes
zson3366 ▴ 10
@zson3366-12250

I am a WGCNA user and have found it really useful for my RNA-seq analysis! Because of my current research interest, the data I am planning to work on are qPCR data with a much smaller number of genes (around 30) and a relatively large sample size (1000 samples, highly heterogeneous). I could not get a scale-free topology or identify any large modules, which I think is because of the small number of genes. I searched a lot but could not find anyone asking about using WGCNA on small gene sets (yes, most datasets are now big data...).

Do you think WGCNA is still applicable to such a small dataset? I am planning to use power = 10 and detect modules with a size of 5 genes. Do you have any suggestions on parameter adjustment or on using the package in this situation?

I really appreciate your time and any suggestions!!!

WGCNA • 2.2k views
@lluis-revilla-sancho
European Union

I don't think that a network following the scale-free topology can be found in 30 genes. There simply aren't enough features to show the characteristics of such networks. You could use the hard-thresholding method instead (it is not explained in the tutorials, but it is well documented).
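To make the difference between soft and hard thresholding concrete, here is a minimal sketch in plain Python. It is not WGCNA code (WGCNA is an R package and offers helpers such as `pickHardThreshold` for choosing the cut); the function names, the toy expression values, the power of 10 and the cut of 0.9 are all assumptions made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(expr):
    """expr maps gene name -> list of per-sample values."""
    genes = list(expr)
    return genes, [[pearson(expr[g], expr[h]) for h in genes] for g in genes]

def soft_adjacency(cor, power):
    """Unsigned soft thresholding: every pair keeps a weight |r|^power."""
    return [[abs(r) ** power for r in row] for row in cor]

def hard_adjacency(cor, cut):
    """Unsigned hard thresholding: a pair is connected (1) only if |r| >= cut."""
    return [[1 if abs(r) >= cut else 0 for r in row] for row in cor]

def connectivity(adj):
    """Row sums of the adjacency matrix, ignoring the diagonal."""
    return [sum(v for j, v in enumerate(row) if j != i) for i, row in enumerate(adj)]

# Hypothetical toy data: 4 genes measured in 5 samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
    "geneD": [1.0, 3.0, 2.0, 5.0, 4.0],
}

genes, cor = correlation_matrix(toy)
soft = soft_adjacency(cor, power=10)   # weighted network
hard = hard_adjacency(cor, cut=0.9)    # unweighted network

for gene, k_soft, k_hard in zip(genes, connectivity(soft), connectivity(hard)):
    print(f"{gene}: soft connectivity {k_soft:.3f}, hard connectivity {k_hard}")
```

With so few genes there are only a handful of connectivity values, which is why a scale-free fit is hard to assess either way; the hard-thresholded network at least gives a simple yes/no picture of which genes are linked.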

However, be aware that you should correct for the high heterogeneity of the samples: the more similar the samples are, the better WGCNA will reflect the underlying biology. You should consider splitting the dataset by group, hospital, method, etc., in order to get more homogeneous groups, and then use the consensus method. Note that using the multiData structure requires some changes to how the networks are built.
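The idea behind splitting and then taking a consensus can be sketched as follows, again in plain Python rather than with WGCNA's own consensus functions. This is a simplification under stated assumptions: it takes the element-wise minimum of per-group soft-thresholded adjacencies, whereas WGCNA's consensus analysis works on topological overlap and calibrates the sets against each other. The group labels, toy values and the power of 6 are made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_by_group(expr, groups):
    """Split gene -> values into one gene -> values dict per group label."""
    out = {}
    for idx, label in enumerate(groups):
        subset = out.setdefault(label, {gene: [] for gene in expr})
        for gene, values in expr.items():
            subset[gene].append(values[idx])
    return out

def soft_adjacency(expr, genes, power):
    """Unsigned soft-thresholded adjacency |r|^power within one data set."""
    return [[abs(pearson(expr[g], expr[h])) ** power for h in genes] for g in genes]

def consensus_adjacency(expr, groups, power):
    """Element-wise minimum of the per-group adjacencies (simplified consensus)."""
    genes = list(expr)
    per_group = [soft_adjacency(sub, genes, power)
                 for sub in split_by_group(expr, groups).values()]
    n = len(genes)
    return genes, [[min(adj[i][j] for adj in per_group) for j in range(n)]
                   for i in range(n)]

# Hypothetical toy data: two hospitals with a large offset between them.
groups = ["h1"] * 4 + ["h2"] * 4
toy = {
    "geneA": [1, 2, 3, 4, 11, 12, 13, 14],
    "geneB": [2, 4, 6, 8, 22, 24, 26, 28],   # tracks geneA inside each hospital
    "geneC": [4, 1, 3, 2, 13, 11, 14, 12],   # tracks geneA only via the offset
}

pooled_r_ac = pearson(toy["geneA"], toy["geneC"])
genes, consensus = consensus_adjacency(toy, groups, power=6)

print(f"pooled r(geneA, geneC) = {pooled_r_ac:.3f}")
print(f"consensus adjacency geneA-geneB = {consensus[0][1]:.3f}")
print(f"consensus adjacency geneA-geneC = {consensus[0][2]:.3f}")
```

In the pooled data geneA and geneC look strongly correlated only because of the offset between the two hospitals; within each hospital the relationship is weak or absent, so the consensus keeps the geneA-geneB link and drops geneA-geneC.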

@peter-langfelder-4469
United States

Indeed, with a small number of genes you are not likely to construct a scale-free network, but the heterogeneity probably plays a large part as well. I suggest you read through the WGCNA FAQ, especially points 2, 5, and 6 (but other parts of the FAQ can be helpful as well). Use a soft thresholding power from the table in point 6. Doing a consensus analysis is one way to deal with a heterogeneous data set, but there are others as well (point 5). 

On a more general note, for 30 genes I would first focus on suitable visualization rather than on trying to find modules. A simple heatmap with clustered genes, and with samples organized either by clustering or by external information (the groups that make the samples so heterogeneous), may already tell you a lot and be easier to interpret than WGCNA modules and their eigengenes. Only if that does not provide all the needed information would I try WGCNA.
