Question: converting extremely sparse count dataframe to continuous distributions for study in WGCNA
3.2 years ago by chrisclarkson100 wrote:

I am very inexperienced with mathematics and expression data.

I have developed a pipeline for WGCNA, practicing with gene-expression microarray data. I am now determined to apply this strategy to microbial community count data.

Initially I tried computing an adjacency matrix from the raw count data:

adjacency(df)

This did produce a set of plots. However, certain WGCNA commands don't work: for example, 'pickSoftThreshold' won't recommend a 'powerEstimate' and returns the following warnings, repeated many times:

Warning in eval(expr, envir, enclos) :
  Some correlations are NA in block 1 : 790 .
Warning in as.vector(log10(dk)) : NaNs produced
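
One plausible cause of the "Some correlations are NA" warning (an assumption, not something confirmed in this thread) is that some taxa have zero variance across samples, which is common in very sparse count tables. A minimal base-R sketch on a hypothetical toy matrix, with samples in rows and taxa in columns as WGCNA expects:

```r
# Hypothetical toy count matrix: rows are samples, columns are taxa.
counts <- cbind(
  taxonA = c(0, 3, 0, 7, 1),
  taxonB = c(5, 0, 2, 0, 9),
  taxonC = c(0, 0, 0, 0, 0),  # never observed: zero variance
  taxonD = c(2, 2, 2, 2, 2)   # constant count: zero variance
)

# Per-taxon standard deviation across samples.
sds <- apply(counts, 2, sd)
zero_var <- names(sds)[sds == 0]

# Pearson correlation is undefined when one column has zero variance,
# so cor() returns NA for those pairs (and warns that the sd is zero).
cors <- suppressWarnings(cor(counts))
has_na_before <- anyNA(cors)

# Dropping the zero-variance taxa removes the NA correlations.
filtered <- counts[, sds > 0, drop = FALSE]
cors_filtered <- cor(filtered)
has_na_after <- anyNA(cors_filtered)

print(zero_var)
print(c(before = has_na_before, after = has_na_after))
```

Filtering like this only removes the undefined correlations; it does not make raw counts a suitable input for a correlation network. WGCNA's goodSamplesGenes function performs a similar check for zero-variance columns and missing values.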

So I tried using voom to convert the counts to a continuous dataset. This runs, but I am doubtful of voom's output:

voom(df, plot = TRUE)

[Image: mean-variance trend plot produced by voom for the sparse count data]

Comparing this plot with a typical plot from the 'voom' paper (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-2-r29) suggests that this output is not valid, given that the data is so sparse.

How can I convert such a sparse count data frame to a validly continuous one?

The following post is related to this one but I did not understand most of the terms that were being used: voom for spectral counts

Answer: converting extremely sparse count dataframe to continuous distributions for stud
3.2 years ago by Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

voom is designed for RNA-seq data. There is no reason to think it would work well for microbial counts, and this has little to do with sparseness.

voom is also incompatible with WGCNA, because WGCNA can't use the voom weights (which are the whole point of voom).
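
To illustrate the point about weights, here is a minimal sketch, assuming the limma package is installed and using simulated negative binomial counts (hypothetical data, not the poster's). voom returns both log-CPM values (`E`) and a matching matrix of precision weights (`weights`); a plain correlation computed on `E` alone never sees the weights.

```r
library(limma)

set.seed(1)
# Simulated counts: 100 features (rows) by 6 samples (columns), hypothetical.
counts <- matrix(rnbinom(600, mu = 50, size = 5), nrow = 100, ncol = 6)

# Simple two-group design matrix (an assumption made for this sketch).
design <- cbind(Intercept = 1, Group = rep(c(0, 1), each = 3))

v <- voom(counts, design, plot = FALSE)

# voom output carries two matrices of the same shape:
dim_E <- dim(v$E)        # log2 counts per million
dim_w <- dim(v$weights)  # observation-level precision weights

# An unweighted feature-by-feature correlation uses v$E only;
# the information in v$weights is discarded.
unweighted_cor <- cor(t(v$E))

print(dim_E)
print(dim_w)
print(dim(unweighted_cor))
```

In limma the weights are consumed downstream by lmFit, which is why passing only `v$E` to a correlation-based method loses what voom was designed to provide.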

If you think that you can adapt voom or other RNA-seq methods to microbial counts, then this is your own statistical research project and your own responsibility. It is not something that the limma authors can advise you on. If you aren't a research statistician, then you might consider starting with what statisticians are already doing for microbial data, for example:

  http://f1000research.com/articles/5-1492/
