Hello, I am running WGCNA on 8 datasets and trying to do a consensus analysis. While working through the WGCNA tutorials and looking at my sample dendrograms, it seems that there may be an outlier or two in my sets.
I am lost on how to choose cut heights for these dendrograms, especially when the cut height has to differ between plots.
In the tutorial, I do not understand the purpose of cutHeights = c(16, 16*exprSize$nSamples[2]/exprSize$nSamples[1]);
Why can we not simply create a vector with one cut height per dataset? Furthermore, when I do set my own cut heights, the sample counts for all sets end up as zero:
> baseHeight = 59
> # One cut height per data set (set 4 gets a higher cut than the others)
> cutHeights = c(59, 59, 59, 75, 59, 59, 59, 59);
> # Re-plot the dendrograms including the cut lines
> pdf(file = "New_Plots_SampleClustering.pdf", width = 12, height = 12);
> par(mfrow=c(2,1))
> par(mar = c(0, 4, 2, 0))
> for (set in 1:nSets)
+ {
+ plot(sampleTrees[[set]], main = paste("Sample clustering on all genes in", setLabels[set]),
+ xlab="", sub="", cex = 0.7);
+ abline(h=cutHeights[set], col = "red");
+ }
> dev.off();
>
> for (set in 1:nSets)
+ {
+ labels = cutreeStatic(sampleTrees[[set]], cutHeight = cutHeights[set])
+ # Keep the largest one (labeled by the number 1)
+ keep = (labels==1)
+ multiExpr[[set]]$data = multiExpr[[set]]$data[keep, ]
+ }
>
> collectGarbage();
> # Check the size of the leftover data
> exprSize = checkSets(multiExpr)
> exprSize
$nSets
[1] 8
$nGenes
[1] 18736
$nSamples
[1] 0 0 0 0 0 0 0 0
$structureOK
[1] TRUE
Hi, WGCNA is not a Bioconductor package. If your question is not addressed in the WGCNA tutorial material, please contact the authors of the package directly.
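That said, one thing that may be worth checking: cutreeStatic() has a minSize argument that defaults to 50, and any branch with fewer samples than that is given the label 0. If your sets each have fewer than 50 samples (an assumption, since the sample counts are not shown), then labels==1 would be FALSE for every sample, which would explain the zeros. Your cutHeights vector with one entry per set looks fine as it is. Below is a small sketch on made-up data; the toy matrix, the cut height of 30 and minSize = 10 are illustrative choices, not values from your analysis.

```r
library(WGCNA)

set.seed(1)
# Toy data (hypothetical): 20 samples in rows, 100 genes in columns
expr <- matrix(rnorm(20 * 100), nrow = 20)
# Shift the last sample so it becomes an obvious outlier
expr[20, ] <- expr[20, ] + 10

# Cluster the samples, as in the tutorial's sample dendrograms
tree <- hclust(dist(expr), method = "average")

# Default minSize = 50: no branch has 50 samples, so nothing is labeled 1
labelsDefault <- cutreeStatic(tree, cutHeight = 30)
print(table(labelsDefault))

# Smaller minSize (assumed value): the main branch can now be labeled 1
labelsSmall <- cutreeStatic(tree, cutHeight = 30, minSize = 10)
print(table(labelsSmall))

# Keep only the samples in the largest branch
keep <- (labelsSmall == 1)
exprKept <- expr[keep, ]
print(dim(exprKept))
```

If this is the cause, passing a suitable minSize to the cutreeStatic() call inside your loop should leave the kept samples in place. Pick a value below the number of samples you expect to remain in each set.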