Search
Question: WGCNA error : Error in hclust
1
17 months ago by
lwang.daniel10
lwang.daniel10 wrote:

Hello, I am using WGCNA to construct co-expression networks.

Scripts:

#!/usr/bin/Rscript
getwd()
workingDir = "."
setwd(workingDir)
library(WGCNA)
options(stringsAsFactors = FALSE)
nGenes = ncol(datExpr)
nSamples = nrow(datExpr)
softpower = 6
dissTOM = 1 - TOMsimilarityFromExpr(datExpr, power = softpower)
nSelect = 1500
set.seed(10)
select = sample(nGenes, size = nSelect)
selectTOM = dissTOM[select, select]
selectTree = hclust(as.dist(selectTOM), method = "average")
selectColors = moduleColors[select]
plotDiss = selectTOM^7
diag(plotDiss) = NA
pdf(file = "Plots/networkHeatmap.pdf", width = 15, height = 15)
TOMplot(plotDiss, selectTree, selectColors, main = "Network heatmap plot, selected genes")
dev.off()
save(dissTOM, file = "dissTOM.RData")
MEs = moduleEigengenes(datExpr, moduleColors)\$eigengenes
MET = orderMEs(MEs)
pdf(file = "Plots/hubGeneHeatmap.pdf", width = 6, height = 6)
plotEigengeneNetworks(MET, "Eigengene adjacency heatmap", marHeatmap = c(3,4,2,2), plotDendrograms = FALSE, xLabelsAngle = 90)
dev.off()

And I got a error:

Error in hclust(as.dist(selectTOM), method = "average") :
NaN dissimilarity value.
Calls: hclust -> .Call
Execution halte

I double checked and I am sure there is non-numeric fpkm data in my input file. Does anyone meet this problem and solved it?

Thank you!

modified 17 months ago by Peter Langfelder1.5k • written 17 months ago by lwang.daniel10
0
17 months ago by
United States
Peter Langfelder1.5k wrote:

My guess is that your data contain some non-varying (constant) genes. Try running goodSamplesGenes on datExpr, then check the component allOK of the result. If that is FALSE, you will need to remove genes with zero variance as illustrated in section 1b of Tutorial I of the WGCNA tutorials.

Dear Petter:

Thank you for your patient explanation.

I have filtered data already by using goodSamplesGenes, before co-expression network construction.

Do you have other suggestion?

Thank you!

0
17 months ago by
United States
Peter Langfelder1.5k wrote:

Well, I would run

and check whether there are any missing or NaN values. If there are any, get theirs rows and/or columns, and check the corresponding columns in datExpr (check their variance).

I don't have any other bright ideas, it is hard to debug a problem without the actual data.

Dear Peter:

I did find NA from result of adjacency(datExpr).

ENSG00000184647.10_1 ENSG00000184650.10_2
ENSG00000000003.14_1         7.008855e-05                   NA
ENSG00000000005.5_1          1.045386e-08                   NA

And the expression profile is as follow:

 ENSG00000184650.10_2 0 0 0 0 0 0 0 0 0.018154 0 0 0 0 0 0 0 0 0 0.021175 0 0 0 0 0 0.016796 ENSG00000184647.10_1 0 0 0 0.001065 0 0.008215 0 0 0.006638 0 0 0.006115 0 0.004287 0 0 0 0 0 0 0 0 0 0 0.008637 ENSG00000000003.14_1 8.34744 6.54812 11.7898 4.09474 2.71354 9.61123 9.86517 2.52549 4.12737 6.10589 7.51262 6.97309 10.6391 6.62466 3.45126 7.86053 5.78048 6.40135 33.6326 9.63476 5.70137 3.65546 1.8626 8.34195 8.78094 ENSG00000000005.5_1 0.005077 0 0 0.005777 0 0 0 0 0.011768 0 0 0 0 0 0 0 0 0 0.089062 0 0 0 0 0 0

The variance between ENSG00000184650.10_2 and ENSG00000000003.14_1 are 32.9416. And the variance between ENSG00000184650.10_2 and ENSG00000000005.5_1 are 0.000173. They are so numeric data. Why did adjacency got NA for these pairs? I use adjacency(datExpr, power = 6) for my data.

How can I solve this problem?

Thank you!

0
17 months ago by
United States
Peter Langfelder1.5k wrote:

I assume you have transposed the expression data before you calculate adjacency (WGCNA requires samples in rows and genes in columns).

I took your data, loaded it into R and am getting no missing or NaN results, so your description or code are missing something. Here's the data I have, transposed into genes in columns format.

> dput(data0)
structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0.018154, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0.021175, 0, 0, 0, 0, 0, 0.016796, 0, 0, 0, 0.001065,
0, 0.008215, 0, 0, 0.006638, 0, 0, 0.006115, 0, 0.004287, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0.008637, 8.347438, 6.54812, 11.78981,
4.094741, 2.713539, 9.611229, 9.865166, 2.525487, 4.12737, 6.10589,
7.512624, 6.973087, 10.63914, 6.62466, 3.451262, 7.860533, 5.780477,
6.401347, 33.63262, 9.634759, 5.701372, 3.655462, 1.862599, 8.341952,
8.780938, 0.005077, 0, 0, 0.005777, 0, 0, 0, 0, 0.011768, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0.089062, 0, 0, 0, 0, 0, 0), .Dim = c(25L,
4L), .Dimnames = list(c("V2", "V3", "V4", "V5", "V6", "V7", "V8",
"V9", "V10", "V11", "V12", "V13", "V14", "V15", "V16", "V17",
"V18", "V19", "V20", "V21", "V22", "V23", "V24", "V25", "V26"
), c("ENSG00000184650.10_2 ", "ENSG00000184647.10_1 ", "ENSG00000000003.14_1 ",
"ENSG00000000005.5_1 ")))

ENSG00000184650.10_2  ENSG00000184647.10_1
ENSG00000184650.10_2            1.000000000          7.189324e-03
ENSG00000184647.10_1            0.007189324          1.000000e+00
ENSG00000000003.14_1            0.029619245          5.234850e-10
ENSG00000000005.5_1             0.110284643          4.246166e-08
ENSG00000000003.14_1  ENSG00000000005.5_1
ENSG00000184650.10_2           2.961925e-02         1.102846e-01
ENSG00000184647.10_1           5.234850e-10         4.246166e-08
ENSG00000000003.14_1           1.000000e+00         4.387697e-01
ENSG00000000005.5_1            4.387697e-01         1.000000e+00

Dear Peter:

Thank you for your hint. I do transfor the data by using log2(x+1). Is it not suitable for my data? Cause WGCNA suggests to do. Should I use the row fpkm to analyze the data? And Is there any differences between using raw fpkm and log2(x+1)?

Thank you!

0
17 months ago by
United States
Peter Langfelder1.5k wrote:

I don't see why log2(x+1) should cause any trouble. It certainly does not cause any trouble with the data I use. Inspect the transformed data - are all entries of the 4 genes you sent finite after the transformation?

Dear Peter:

Thank you for your advice. I check the log(x+1) result. And the fpkm of four genes after this transform are all numeric data.

Do you mind check my script and data? I can give it to you, if you do not mind.