WGCNA parallelization (multi-threading) for blockwiseModules and TOMsimilarity
bioming ▴ 10
@bioming-21835
Last seen 17 months ago
Queen's University

Hello,

I'm currently using WGCNA v1.68 to do network analysis on 50k probes. I have a few questions regarding parallelization in WGCNA, particularly when running the blockwiseModules and TOMsimilarity functions. I came across another Bioconductor question on this topic (https://support.bioconductor.org/p/86147/), but it was from 3 years ago, so I was wondering if there are any updates I should be aware of?

In the previous question, Peter said that blockwiseModules was not parallelized; has this changed? He kindly suggested using a faster BLAS to speed up the matrix multiplication in the TOM calculations. Currently, my output when running TOMsimilarity() shows "..matrix multiplication (system BLAS)..", so I'm guessing the system BLAS is not the fast BLAS Peter was referring to? Does anyone know which fast BLAS I should try installing? (I'm currently running R on a CentOS server with up to 50 cores.)

Much thanks for any help anyone can provide,

Ming

@peter-langfelder-4469
Last seen 12 months ago
United States

I'll try to explain this as best as I can. When calculating TOM from expression data, the WGCNA package does some parallelization, but only in the correlation calculations, and even those lead to a noticeable speedup only when there are many missing values in the expression data, which is rare these days. When there are no missing data, the most time-consuming step is the matrix multiplication of the adjacency matrix with itself. WGCNA performs this via a call to a BLAS routine unless the argument useInternalMatrixAlgebra is TRUE (by default it is FALSE), in which case the matrix multiplication is performed by a slow WGCNA-internal routine. I do not recommend that route unless you have a good reason to suspect that your BLAS libraries are buggy.
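For reference, a minimal sketch of the call path being discussed (datExpr and the soft-thresholding power are placeholders for your own data and chosen power):

```r
library(WGCNA)

# datExpr: samples x genes expression matrix (placeholder name)
adj <- adjacency(datExpr, power = 6)

# Default: the adjacency is multiplied with itself via whatever BLAS R links to
tom <- TOMsimilarity(adj)

# Slow fallback using WGCNA's own multiplication routine; only worth trying
# if you suspect the system BLAS is buggy
# tom <- TOMsimilarity(adj, useInternalMatrixAlgebra = TRUE)
```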

When WGCNA reports "system BLAS", it means it uses whatever BLAS R was compiled against. You can see which one that is by running sessionInfo(); mine reports the line

BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.7.so

signifying that my R is compiled against OpenBLAS, which is quite fast.

Depending on what system you are on and whether you have administrative privileges, getting R to work with a fast BLAS may be trivial or very complicated. I recommend starting with the R Installation and Administration manual at https://cran.r-project.org/doc/manuals/r-release/R-admin.html, specifically the BLAS section at https://cran.r-project.org/doc/manuals/r-release/R-admin.html#BLAS.
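As a starting point, a sketch of how one might check which BLAS R is currently linked against (commands and paths are examples; on a managed cluster the actual switch is typically handled via environment modules or by the administrators):

```shell
# Report the BLAS flags R was configured/built with
R CMD config BLAS_LIBS

# Recent R versions print the BLAS/LAPACK library paths in sessionInfo()
Rscript -e 'sessionInfo()'
```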


Hi Peter, thank you so much for your fast response; I understand more now. When I run sessionInfo() it indeed shows the generic Rblas. I'm working on a computing cluster, and getting R to switch to OpenBLAS seems to be complicated, as you foresaw.

On another note regarding blockwiseModules, I just wanted to confirm the parallelization inside this function. I tried testing on some BRCA data (590 subjects x 8640 genes), and ran with:

1. nThreads = 1 and maxBlockSize = 5000 (2 blocks): took 6 min
2. nThreads = 18 and maxBlockSize = 5000 (2 blocks): also 6 min
3. nThreads = 1 and maxBlockSize = 500 (18 blocks): also 6 min
4. nThreads = 18 and maxBlockSize = 500 (18 blocks): took 3 min 47 sec

Am I correct in assuming that the 18 blocks, when given enough threads, will execute in parallel, but that the TOM calculation inside blockwiseModules is the part that has not been parallelized and would benefit from OpenBLAS? So for a large dataset, running blockwiseModules together with OpenBLAS would be the best way to go?
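For reference, the calls I'm comparing look roughly like this (datExpr and the power value are placeholders for my actual BRCA matrix and soft-thresholding power):

```r
library(WGCNA)

# Enable WGCNA's own multi-threading (used in the correlation step)
enableWGCNAThreads(nThreads = 18)

# datExpr: 590 subjects x 8640 genes (placeholder name)
net <- blockwiseModules(datExpr,
                        power = 6,
                        maxBlockSize = 500,  # gives 18 blocks in my test
                        nThreads = 18)
```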

Ming