Dear Bioconductor users,
I am working with TCGA RNA-seq data. I performed a WGCNA analysis on vst-transformed data and found 13 network modules. I now want to run a module preservation analysis using the same data as both the reference and the test set, in order to check the robustness of the module definitions. The sizes of the identified modules are: 82, 88, 89, 114, 177, 177, 915, 1064, 1162, 1272, 1692, 2045, 3248.

I have read in the paper "Is My Network Module Preserved and Reproducible?" that the Z statistics and permutation test p-values often depend on module size (i.e. the number of nodes in a module). This reflects the intuition that observing preserved connectivity patterns among the nodes of a large module is more significant than observing the same among the nodes of a small module. I also believe that using maxModuleSize = 1000 (the default) in the modulePreservation function will bias the analysis, because six of my modules are larger than 1000 genes and would be randomly reduced to that size.

Is it appropriate to set maxModuleSize = 3248 and to focus primarily on the medianRank composite statistic? Additionally, I have not understood the role of the gold module in the modulePreservation analysis. Should I also change maxGoldModuleSize from its default value (1000)?
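To make the question concrete, here is a minimal sketch of the setup being asked about. It is only an illustration: `multiExpr`, `multiColor`, `runPreservation` and the `nPermutations` value are placeholder names and settings, not objects from the actual analysis, and the WGCNA call is wrapped in a function so that nothing is run until real data are supplied.

```r
# Module sizes reported above
moduleSizes <- c(82, 88, 89, 114, 177, 177, 915,
                 1064, 1162, 1272, 1692, 2045, 3248)

# Modules that exceed the default maxModuleSize of 1000
overDefault <- moduleSizes[moduleSizes > 1000]
largest <- max(moduleSizes)

# Hypothetical wrapper around WGCNA::modulePreservation.
# multiExpr:  list of sets, each of the form list(data = <samples x genes matrix>)
# multiColor: list of module label vectors, one per set with module assignments
# maxSize / maxGoldSize: the two parameters the question is about
runPreservation <- function(multiExpr, multiColor,
                            maxSize = 1000, maxGoldSize = 1000) {
  WGCNA::modulePreservation(
    multiExpr, multiColor,
    referenceNetworks = 1,     # first set is the reference
    nPermutations = 200,       # illustrative value only
    randomSeed = 1,
    maxModuleSize = maxSize,
    maxGoldModuleSize = maxGoldSize,
    verbose = 3
  )
}

# Intended use (not run here), keeping every module at full size:
# mp <- runPreservation(multiExpr, multiColor, maxSize = largest)
```

The open points are whether `maxSize = largest` is a sensible choice and what value `maxGoldSize` should take.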
I would appreciate hearing your opinion.
Thank you very much in advance for your time.
Sincerely,
Panagiotis Mokos