WGCNA with large Methylation matrix

Lorenzo • 0

I am trying to run WGCNA on a methylation data matrix of approximately 150 samples and 384,000 probes (beta values).

This is the call I am using (datExpr has samples as rows and probes as columns):

bwnet = WGCNA::blockwiseModules(datExpr, 
                                maxBlockSize = 1000, minModuleSize = 30, 
                                power = 6, TOMType = "unsigned", 
                                reassignThreshold = 0, mergeCutHeight = 0.25, 
                                numericLabels = TRUE, 
                                saveTOMs = FALSE, saveTOMFileBase = "methyl", 
                                verbose = 7, nThreads = 18)

I am running this on an HPC cluster, requesting 20 cores and 140 GB of memory. Every run ends with the process being killed, and this is the last output I get:

Calculating module eigengenes block-wise from all genes
   Flagging genes and samples with too many missing values...
    ..step 1
 ....pre-clustering genes to determine blocks..
   Projective K-means:
   ..using 19247 centers.
   ..k-means clustering..
    ..iteration 1
      ..proposing to move 342588 genes..some genes got worse. Trying again.
      ..proposing to move 342046 genes..move accepted.
Killed

It seems to keep running out of memory. This confuses me, because the package tutorials state repeatedly that the block-wise approach makes it possible to analyze large datasets with little memory. Am I missing something in my code? Thank you.
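For reference, here is a rough back-of-the-envelope check of where the memory might go, using only the numbers reported above. It is a sketch, not a diagnosis: the helper `dense_gib` is hypothetical, and the idea that the pre-clustering step holds a dense centers-by-probes matrix of doubles is an assumption about the internals, not something the log confirms.

```r
# Size in GiB of a dense matrix of doubles (8 bytes per entry).
# dense_gib is a hypothetical helper, not part of WGCNA.
dense_gib <- function(n_row, n_col) n_row * n_col * 8 / 2^30

n_samples <- 150      # approximate, from the description above
n_probes  <- 384000   # approximate, from the description above
n_centers <- 19247    # "using 19247 centers" in the log
block     <- 1000     # maxBlockSize in the call

data_gib    <- dense_gib(n_samples, n_probes)  # the input matrix itself
block_gib   <- dense_gib(block, block)         # one block-sized square matrix
centers_gib <- dense_gib(n_centers, n_probes)  # ASSUMED centers-by-probes matrix

print(round(c(data = data_gib, block = block_gib, centers = centers_gib), 3))
```

If the assumption holds, the block-sized matrices are negligible and the centers-by-probes matrix alone is on the order of 55 GiB, so a couple of temporary copies of it would already exceed the 140 GB allocation.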

R version 4.0.4 (2021-02-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /soft/modules/software/OpenBLAS/0.3.12-GCC-10.2.0/lib/libopenblas_skylakexp-r0.3.12.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.0.4 jsonlite_1.7.3 rlang_0.4.12

Tags: wgcna, R, WGCNA, Methylation

Posted 4.0 years ago by Lorenzo