Question

Steep dendrograms and few modules. WGCNA on RNA seq data

0

agustin.gonvi ▴ 20

@agustingonvi-20284

Last seen 2.1 years ago

Cleveland, OH

Hi, I am trying to run WGCNA on RNA-seq data and I end up with few modules. I've seen a similar post, but unlike those data, mine show a better scale-free topology fit index. I ran 3 different databases and always get similar results, so I am wondering if I am missing something.

Fitting Index

Here is the code:

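library(WGCNA)

# Signed adjacency from the expression matrix (samples in rows, genes in columns)
adjacency = adjacency(datExpr,
                      type = "signed",
                      power = 5)

# Topological overlap matrix computed from the adjacency
TOM = TOMsimilarity(adjacency,
                    TOMType = "signed",
                    TOMDenom = "mean",
                    suppressTOMForZeroAdjacencies = FALSE,
                    verbose = 5)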

And the results

The count matrix was created from FASTQ files in Galaxy using Bowtie2/HTSeq. All samples were then normalized using DESeq2 and exported. Further filtering based on counts, row variance and protein-coding genes, as well as log2 transformation, was done in R. The final number of genes was 14692.
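A rough sketch of this kind of filtering and transformation in base R is shown below. The post does not give the actual thresholds, so the simulated `normCounts` matrix, `minCount`, `minSamples` and the variance quantile are all hypothetical placeholders, and the protein-coding filter is left out because it needs an annotation table.

```r
# Simulated stand-in for DESeq2-normalized counts (genes in rows, samples in columns)
set.seed(1)
normCounts <- matrix(rnbinom(1000 * 8, mu = 50, size = 2), nrow = 1000,
                     dimnames = list(paste0("gene", 1:1000), paste0("S", 1:8)))

# Hypothetical thresholds, not taken from the post
minCount   <- 10    # minimum normalized count
minSamples <- 4     # number of samples that must reach minCount
varQuantile <- 0.25 # drop the least variable quarter of genes

# 1. Count filter: keep genes expressed in enough samples
keepCount <- rowSums(normCounts >= minCount) >= minSamples

# 2. log2 transform with a pseudocount of 1 to avoid log2(0)
logExpr <- log2(normCounts[keepCount, , drop = FALSE] + 1)

# 3. Variance filter on the log scale
rowVar  <- apply(logExpr, 1, var)
keepVar <- rowVar > quantile(rowVar, varQuantile)

# 4. Transpose: WGCNA expects samples in rows and genes in columns
datExpr <- t(logExpr[keepVar, , drop = FALSE])
cat("Genes kept:", ncol(datExpr), "of", nrow(normCounts), "\n")
```

The transpose at the end matters because `adjacency()` treats columns as genes.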

Thanks

RNA seq wgcna • 948 views

Updated 4.9 years ago by Peter Langfelder ★ 3.0k • written 4.9 years ago by agustin.gonvi ▴ 20

score 4 · Accepted Answer · 2019-06-03

Peter Langfelder ★ 3.0k

@peter-langfelder-4469

Last seen 29 days ago

United States

The only obvious thing is that the power 5 is quite low for a "signed" network. I would raise it to, say, 10, or use "signed hybrid" for the network type. However, that may not help you get more modules. You should plot a sample clustering tree and/or a PCA plot to make sure you don't have a few large sample clusters; if you do, investigate whether they are biologically plausible/interesting or whether they are likely to be technical, and run an adjustment if the clusters aren't driven by a variable of interest.

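The sample-level check suggested in the answer can be sketched in base R as follows. The `datExpr` here is simulated with a hypothetical batch shift in the first six samples so that the example runs on its own; with real data, replace it with the samples-by-genes matrix passed to WGCNA.

```r
# Simulated stand-in for datExpr: samples in rows, genes in columns
set.seed(1)
nSamples <- 12
nGenes   <- 200
datExpr <- matrix(rnorm(nSamples * nGenes), nrow = nSamples,
                  dimnames = list(paste0("S", 1:nSamples), paste0("g", 1:nGenes)))
# Hypothetical technical effect: shift the first 6 samples in half of the genes
datExpr[1:6, 1:100] <- datExpr[1:6, 1:100] + 3

# Sample clustering tree: a few large, well-separated branches are a warning sign
sampleTree <- hclust(dist(datExpr), method = "average")
plot(sampleTree, main = "Sample clustering", xlab = "", sub = "")

# PCA of the samples, labelled by sample name
pca <- prcomp(datExpr, center = TRUE, scale. = FALSE)
varExplained <- pca$sdev^2 / sum(pca$sdev^2)
plot(pca$x[, 1], pca$x[, 2], type = "n",
     xlab = sprintf("PC1 (%.0f%%)", 100 * varExplained[1]),
     ylab = sprintf("PC2 (%.0f%%)", 100 * varExplained[2]))
text(pca$x[, 1], pca$x[, 2], labels = rownames(datExpr))

# Cut the tree into two groups and tabulate them; compare against known
# batch or phenotype annotations to decide whether the split is technical
groups <- cutree(sampleTree, k = 2)
print(table(groups))
```

If the groups line up with a technical variable such as batch or sequencing run rather than a variable of interest, that is the case where an adjustment would be considered before building the network.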
0

Thanks! I've noticed that the 3 RNA-seq databases I am working with reach a fitting index of 0.8 at soft-thresholding powers between 4 and 6. With microarrays, I was using powers between 9 and 11. That could be the problem. Is there a mean connectivity I should be shooting for? I was working under the assumption that the larger the better, based on microarray data, which was always at the lower end, but that may not be true with these new data, which show stronger connections.

Reply • 4.9 years ago • agustin.gonvi ▴ 20

1

I aim for mean connectivity around 100 or less (typically between 30 and 50); median connectivity usually ends up around 10.

Reply • 4.9 years ago • Peter Langfelder ★ 3.0k
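To see how the power affects mean and median connectivity in a signed network, the following base R sketch may help. It uses the signed adjacency definition, ((1 + cor) / 2)^power, on simulated data; `signedConnectivity` is a hypothetical helper written for this illustration, not a WGCNA function. In WGCNA itself, `pickSoftThreshold()` reports `mean.k.` and `median.k.` for each candidate power.

```r
# Simulated expression matrix: samples in rows, genes in columns
set.seed(1)
nSamples <- 30
nGenes   <- 300
datExpr <- matrix(rnorm(nSamples * nGenes), nrow = nSamples)
# Add a shared factor to the first 100 genes so that they are co-expressed
sharedFactor <- rnorm(nSamples)
datExpr[, 1:100] <- datExpr[, 1:100] + 2 * sharedFactor

corMat <- cor(datExpr)

# Hypothetical helper: per-gene connectivity in a signed network
signedConnectivity <- function(corMat, power) {
  adj <- ((1 + corMat) / 2)^power  # signed adjacency: cor = 0 maps to 0.5^power
  diag(adj) <- 0                   # exclude self-connections
  rowSums(adj)
}

powers <- c(5, 10)
result <- data.frame(
  power   = powers,
  meanK   = sapply(powers, function(p) mean(signedConnectivity(corMat, p))),
  medianK = sapply(powers, function(p) median(signedConnectivity(corMat, p)))
)
print(result)
```

Because an uncorrelated gene pair still receives adjacency 0.5^power in a signed network, a low power leaves a large background connectivity from unrelated genes; raising the power shrinks that background much faster than it shrinks the strong connections.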