Question

Finding Modules with WGCNA - Collapsing Biological Replicates in Expression Data - Then Finding Eigengenes on Previously Found Modules


vsa1111 • 0

@vsa1111-22207

Last seen 4.5 years ago

Hello,

I have RNA-seq expression data from isolated immune cell populations taken from a sizable number of patients (10). Different patients contribute differing numbers of samples. Some patients could have 3 samples taken at 3 different timepoints, while some could only have 1 sample taken at 1 timepoint. In all there are around 25 samples.

The Dilemma: Running WGCNA with mostly default settings gave (at least to me) pretty meaningful results. However, I decided to collapse samples taken from the same patient, so as not to introduce biases towards patients contributing more samples. Running WGCNA on the resulting sample size of 10 seemed to return spurious results. It produced many more modules (of smaller size) than the run before patient collapsing, and left very few genes unassigned to any module. My hypothesis is that, since I am now underpowered, the additional or stronger associations between genes that would lead to larger modules cannot be detected. I also lose the extra information needed to indicate that a given gene may not actually belong in a given module (and should therefore go in the unassigned pile). The reason I would like to run WGCNA on my patient-collapsed data is that I do not want to overemphasize module-trait relationships for patient-specific traits (e.g. sex, current smoker, hypertensive, etc.).

My Proposal: I would like to see if my proposed alternate pipeline passes sanity checks. I would run WGCNA on my full non-collapsed data of 25 samples, which gives me a certain number of groups of genes (modules). I would then collapse my expression data and compute Module Eigengenes from the collapsed data. In other words, I am using the first WGCNA run only to create the groups (so that I am better powered to make meaningful groups), and then finding Module Eigengenes from the collapsed expression data: for each group I perform PCA and take the first component. I can then relate these Module Eigengenes to my patient-specific traits. Each Eigengene will be of length n = number of patients. The rationale behind this proposed method is as follows: I want to be powered when performing module detection, but I do not want to overemphasize relationships between modules and traits.
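To make the proposal concrete, here is a minimal sketch in plain Python (standard library only) of the last two steps: averaging samples per patient, then taking the first principal component of each previously found module on the collapsed data. Everything in it is an assumption for illustration: the function names `collapse_by_patient` and `module_eigengene`, the use of a simple per-patient mean as the collapsing rule, and the toy numbers and module labels. It is not WGCNA's own implementation; in R, the natural route would presumably be to pass the collapsed matrix and the original module labels to WGCNA's `moduleEigengenes`.

```python
import math
from collections import defaultdict


def collapse_by_patient(expr, patient_ids):
    """Average the rows of expr (samples x genes) that share a patient ID."""
    groups = defaultdict(list)
    for row, pid in zip(expr, patient_ids):
        groups[pid].append(row)
    patients = sorted(groups)
    collapsed = [
        [sum(col) / len(col) for col in zip(*groups[p])] for p in patients
    ]
    return patients, collapsed


def module_eigengene(expr, gene_idx, n_iter=200):
    """First principal component scores of the chosen genes, one per row.

    Genes are standardized, then the leading eigenvector of the
    sample-by-sample Gram matrix is found by power iteration.
    """
    z_cols = []
    for g in gene_idx:
        col = [row[g] for row in expr]
        m = sum(col) / len(col)
        ss = sum((x - m) ** 2 for x in col)
        if ss == 0:
            continue  # a constant gene carries no information; skip it
        sd = math.sqrt(ss / (len(col) - 1))
        z_cols.append([(x - m) / sd for x in col])
    if not z_cols:
        raise ValueError("no non-constant genes in this module")
    z = list(zip(*z_cols))  # back to samples x genes
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in z] for ri in z]
    # Starting from the row sums orients the sign along average expression.
    v = [sum(r) for r in z]
    if not any(v):
        v = [r[0] for r in z]  # fallback start: first standardized gene
    for _ in range(n_iter):
        w = [sum(g * x for g, x in zip(grow, v)) for grow in gram]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy data (made up): 5 samples from 3 patients, 3 genes.
expr = [
    [1.0, 2.0, 9.0],
    [3.0, 4.0, 7.0],
    [5.0, 5.0, 5.0],
    [6.0, 7.0, 2.0],
    [8.0, 9.0, 1.0],
]
patient_ids = ["P1", "P1", "P2", "P3", "P3"]
# Hypothetical module labels, as if taken from the run on all samples.
module_labels = ["blue", "blue", "grey"]

patients, collapsed = collapse_by_patient(expr, patient_ids)
blue_genes = [i for i, lab in enumerate(module_labels) if lab == "blue"]
eigengene = module_eigengene(collapsed, blue_genes)
print(patients, eigengene)  # one eigengene value per patient
```

The eigengene has one entry per patient, so it can be correlated directly with patient-level traits. Note that the module labels themselves still come from the non-collapsed run, so any patient imbalance in that run carries over into which genes are grouped together.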

Is this proposal a sound approach to the dilemma?

Thank you in advance for any support!

WGCNA Module Detection • 807 views

ADD COMMENT • link updated 3 months ago by dima11i • 0 • written 4.5 years ago by vsa1111 • 0


Could you explain what you actually do to "collapse" samples from the same patient?

By the way, my main concern about running WGCNA with only 10 samples would not be low power but rather the increased probability of spurious correlations!
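A quick simulation can illustrate the point. The sketch below is only an illustration under stated assumptions: it generates pure Gaussian noise "genes" with no true relationships, and counts how many gene pairs exceed an arbitrary cutoff of |r| > 0.7 with 10 versus 25 samples. The function names, the cutoff, and the number of genes are made up for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spurious_fraction(n_samples, n_genes=100, threshold=0.7, seed=0):
    """Fraction of independent noise gene pairs with |r| above threshold."""
    rng = random.Random(seed)
    genes = [
        [rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_genes)
    ]
    hits = total = 0
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            total += 1
            if abs(pearson(genes[i], genes[j])) > threshold:
                hits += 1
    return hits / total


for n in (10, 25):
    print(n, "samples:", spurious_fraction(n))
```

Since the simulated genes are independent, every pair passing the cutoff is spurious by construction, and the fraction should be noticeably higher with 10 samples than with 25.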

ADD REPLY • link 4.5 years ago mikhael.manurung ▴ 270


Load and normalize your gene expression data.

ADD REPLY • link 3 months ago dima11i • 0