I am very inexperienced with mathematics and expression data.
I have developed a WGCNA pipeline by practicing on gene-expression microarray data, and I would now like to apply the same approach to count data from microbial communities.
Initially I tried computing an adjacency matrix directly from the raw count data:
adjacency(df)
This did produce a set of plots. However, certain WGCNA commands do not work: for example, 'pickSoftThreshold' does not recommend a 'powerEstimate' and instead returns the following warnings, repeated many times:
Warning in eval(expr, envir, enclos) :
Some correlations are NA in block 1 : 790 .
Warning in as.vector(log10(dk)) : NaNs produced
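One likely source of the NA correlations is taxa whose counts are identical (typically all zero) in every sample, since a correlation involving a zero-variance column is undefined. Below is a minimal sketch of how this could be checked in base R, assuming samples are in rows and taxa in columns; the toy matrix `counts` and its taxon names are made up for illustration and stand in for `df`:

```r
# Hypothetical toy count matrix: 6 samples (rows) x 4 taxa (columns).
# taxonC is never observed, so its variance is zero.
counts <- cbind(
  taxonA = c(0, 3, 0, 5, 1, 0),
  taxonB = c(2, 0, 0, 7, 0, 1),
  taxonC = c(0, 0, 0, 0, 0, 0),
  taxonD = c(10, 4, 6, 0, 2, 8)
)

# cor() cannot compute a correlation against a zero-variance column,
# so pairs involving taxonC come back as NA (with a warning).
cors <- suppressWarnings(cor(counts))

# Flag zero-variance taxa and drop them before building the network.
zero_var <- apply(counts, 2, var) == 0
filtered <- counts[, !zero_var, drop = FALSE]

# With the constant column removed, all correlations are defined.
cors_filtered <- cor(filtered)
```

Whether this fully explains the warnings above is an assumption; very sparse taxa that are non-zero in only one or two samples may still need separate filtering.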
So I tried using voom to convert the counts to continuous values. This runs, but I am doubtful of voom's output:
voom(df, plot = TRUE)
Comparing this plot with a typical one from the 'voom' paper (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-2-r29) suggests that this output is not valid, given that the data are so sparse.
How can I validly convert such a sparse count data frame to a continuous one?
The following post is related to this one but I did not understand most of the terms that were being used: voom for spectral counts