WGCNA "bicor" correlation
Posted by bcbio_uk (United Kingdom)

Hello,

I have a question about which correlation to use for small sample sizes when creating an adjacency matrix in WGCNA. I have 8 samples in my condition, and although this is not an ideal number for WGCNA, I would like to give it a try. I was advised to use the biweight midcorrelation (bicor) instead of the default Pearson correlation, because bicor is based on medians rather than means when calculating the statistic, which should make it more robust for small sample sizes.

Question 1. Will bicor correlation indeed be better for a sample size of 8?

Question 2. I'm not entirely sure how to implement this in WGCNA when creating the adjacency matrix. I tried adding the corFnc = "bicor" argument to the adjacency() function. Here is my code:

library(WGCNA)
library(flashClust)
# data: expression matrix with samples in rows and genes in columns
adjacency_matrix = adjacency(data, power = 12, type = "signed", corFnc = "bicor")
dissTOM = 1 - TOMsimilarity(adjacency_matrix, TOMType = "signed")
geneTree = flashClust(as.dist(dissTOM), method = "average")

I get the following warning when I implement the adjacency() function with "bicor":

Warning message:
In bicor(datExpr, use = "p") :
bicor: zero MAD in variable 'x'. Pearson correlation was used for individual columns with MAD=NA.

I would appreciate any help.

Thank you.


It might be too late, but did you check the data with the goodSamplesGenes function?


I got the same warning even after all my genes passed the goodSamplesGenes() check. Were you able to solve this problem?


I would not worry about it too much; it is not necessarily a problem. MAD is the median absolute deviation, i.e., the median of the absolute differences between the observations and their median. A zero MAD means the gene is constant in most samples (it is enough for the gene to take the same value in more than half of the observations for the MAD to be zero). goodSamplesGenes and friends check for (a) zero variance and (b) excessive counts of missing data. These functions do not check for zero MAD.
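
To see which genes trigger the warning, something along these lines may help. It is only a sketch using base R on made-up data: the names gene_x, expr, gene_mad and zero_mad_genes are hypothetical, and expr stands in for your own matrix with samples in rows and genes in columns.

```r
# Toy expression values for one gene across 8 samples (made-up data):
# 5 of the 8 values are identical, i.e. more than half are constant.
gene_x <- c(0, 0, 0, 0, 0, 2.1, 3.4, 1.7)

var(gene_x) > 0   # TRUE: the variance is non-zero, so a zero-variance check passes
mad(gene_x)       # 0: the median absolute deviation is zero

# Toy matrix standing in for the real data: samples in rows, genes in columns
set.seed(1)
expr <- cbind(geneA = rnorm(8), geneB = gene_x, geneC = rnorm(8))

# Compute the MAD of every gene (column) and list the ones where it is zero
gene_mad <- apply(expr, 2, mad, na.rm = TRUE)
zero_mad_genes <- names(gene_mad)[gene_mad == 0]
zero_mad_genes
```

Here only geneB is flagged, even though its variance is not zero. According to the warning text, bicor falls back to Pearson correlation for just those columns, so listing them is mainly a way to judge how many genes are affected; whether to filter them out is a judgment call for your data.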