Hello group!
I am trying to use the WGCNA package to find modules among my candidate genes. I have no expression data, only a distance matrix derived from a sequence alignment. The distance matrix was converted from a Newick file with a Python script.
When I asked Dr. Steve Horvath about this situation, I was told:
The WGCNA package does not require gene expression data. Rather many functions directly apply to an adjacency matrix (or conversely a distance matrix). e.g. dissTOM, networkConcepts, flashClust.
Regarding your question: Intramodular hub genes are equivalent to module eigengenes (as shown in Horvath Dong 2008)
Thus, you can simply represent a module by the most highly connected intramodular node.
Toward this end, you can use the following function from the link
> intramodularConnectivity(adjMat, colors, scaleByMax = FALSE)
However, I am still not clear about the next steps. Following the tutorial, I am stuck at this step:
# MEList = moduleEigengenes(datExpr, colors = dynamicColors)
> MEList <- intramodularConnectivity(distMatrix, colors, scaleByMax = FALSE)
> MEs = MEList$eigengenes
# Calculate dissimilarity of module eigengenes
> MEDiss <- 1-cor(MEs);
# Cluster module eigengenes
> METree <- hclust(as.dist(MEDiss), method = "average");
> MEDissThres = 0.25
# Plot the cut line into the dendrogram
> abline(h=MEDissThres, col = "red")
# Call an automatic merging function
> merge = mergeCloseModules(distMatrix, dynamicColors, cutHeight = MEDissThres, verbose = 3)
> mergedColors = merge$colors;
# Eigengenes of the new merged modules:
> mergedMEs = merge$newMEs;
> plotDendroAndColors(geneTree, cbind(dynamicColors, mergedColors), c("Dynamic Tree Cut", "Merged dynamic"), dendroLabels = FALSE, hang = 0.03, addGuide = TRUE, guideHang = 0.05)
> # dev.off()
To be more specific, my questions are:
1) How do I get an adjacency matrix from the distance matrix for the intramodularConnectivity() function, given that I do not have expression data?
2) If the distance matrix can be used as the adjacency matrix, how do I get "eigengenes" from intramodularConnectivity() to feed into the next steps? Its output is a four-column data frame that has no "eigengenes" component.
3) How do I handle the mergeCloseModules() step, which needs a datExpr object, when I only have a distance matrix?
I am aware that the whole issue is how to handle a distance matrix as input instead of expression data. I would appreciate hearing from anyone with experience of a similar scenario; I think this approach might be useful for phylogenetic studies.
Thanks a lot!
Yifang
Thanks Peter!
You clarified many of my questions.
First, about the trait: my idea is to use gene function as the trait, say, different groups of transcription factors, as a discrete input (without replicates etc.). I am not sure at this moment whether that is adequate, but it is not my priority. What I need more is a technique for getting the intramodular hub genes.
I have tried:
to assign each gene to a cluster group. However, this simple approach does not seem to learn the modules the way regular WGCNA does.
1) How do I get the hub genes?
2) How do I match the resulting cluster groups (from cutreeDynamic) with the hub genes, if that is appropriate?
Thanks!
Hello:
I am still trying to get my distance matrix dataset (from a ClustalW alignment and tree construction) to run.
The error clearly says that the values in my distance matrix are not between -1 and 1. I spent hours searching for how to convert a distance matrix into a similarity matrix, but found no clear answer, only more confusion. Unfortunately, I could not find an example similar to my case, where the only input is a distance matrix. I am not sure I am doing this the right way.
I would appreciate any input that helps me solve this problem. Thanks a lot!
Yifang
The simplest way of turning a distance matrix into a similarity matrix consists of two steps. First, scale the distances to lie between 0 and 1, e.g. by dividing the distances by their maximum or by taking tanh() of the distance matrix (perhaps scaled by some characteristic distance, e.g. mean or 2*mean). Any number of other transformations are also possible. Second, take 1 - (scaled distance) as the similarity. The subtraction from 1 is necessary because similarity is supposed to be near 1 for similar (close) objects, whereas the distance between such objects is close to 0.
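A minimal sketch of these two steps in base R, offered as an illustration rather than a definitive recipe. The toy matrix, the gene names, and the variable names scaledMax, scaledTanh, charDist, simMax and simTanh are assumptions made up for this example; only distMatrix echoes the name used earlier in the thread.

```r
# Toy symmetric distance matrix for four hypothetical genes
# (illustrative values only, not real alignment distances).
geneNames <- paste0("gene", 1:4)
distMatrix <- matrix(c(0.0, 0.2, 0.9, 1.5,
                       0.2, 0.0, 0.8, 1.4,
                       0.9, 0.8, 0.0, 0.3,
                       1.5, 1.4, 0.3, 0.0),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(geneNames, geneNames))

# Step 1, option A: scale to [0, 1] by dividing by the largest distance.
scaledMax <- distMatrix / max(distMatrix)

# Step 1, option B: squash with tanh(), after dividing by a characteristic
# distance. Here 2 * mean of the off-diagonal distances is assumed.
charDist <- 2 * mean(distMatrix[upper.tri(distMatrix)])
scaledTanh <- tanh(distMatrix / charDist)

# Step 2: similarity = 1 - scaled distance, so close genes score near 1.
simMax <- 1 - scaledMax
simTanh <- 1 - scaledTanh

print(round(simMax, 2))
print(round(simTanh, 2))
```

With the maximum scaling, the most distant pair gets similarity exactly 0; with tanh(), every similarity stays strictly above 0, so the choice affects how strongly distant genes are still connected. Either result has entries between 0 and 1 with 1 on the diagonal, which is the range the error discussed above asks for. Whether it is a biologically sensible adjacency for sequence-derived distances remains a judgment call.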
HTH,
Peter