I have an RNA-seq dataset containing 200 samples from asthma patients beginning therapy with one of two drugs (drug A and drug B), i.e.:
Drug A: 50 samples pre-treatment (baseline)
Drug A: 50 samples one week post-treatment
Drug B: 50 samples pre-treatment (baseline)
Drug B: 50 samples one week post-treatment
I'm interested in using WGCNA to identify co-expressed gene modules in the pre-treatment baseline samples that correlate with various clinical traits. As part of a differential expression analysis I would then like to carry out gene set testing on these modules to see how they behave at week 1 compared to baseline, i.e. are the genes in the baseline modules upregulated or downregulated at week 1?
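To make the second step concrete, here is a rough sketch of the kind of module-level test I have in mind. It is not WGCNA and not a proper gene set test: it assumes one module's gene list is already known, uses simulated log-expression values, scores each sample by the average z-score of the module genes (a simplified stand-in for a module eigengene), and assumes the baseline and week 1 samples are paired by patient. All names (`module_scores`, `paired_t`) and numbers are hypothetical.

```python
import math
import random
import statistics


def module_scores(expr, ref_mean, ref_sd):
    """Average z-score across the module genes for each sample.

    expr: list of samples, each a list of log-expression values (one per gene).
    ref_mean, ref_sd: per-gene mean and SD, estimated on the baseline samples.
    """
    return [
        statistics.fmean([(x - m) / s for x, m, s in zip(sample, ref_mean, ref_sd)])
        for sample in expr
    ]


def paired_t(baseline_scores, week1_scores):
    """Mean paired difference (week 1 minus baseline) and its t statistic."""
    diffs = [w - b for b, w in zip(baseline_scores, week1_scores)]
    mean_diff = statistics.fmean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err


# Simulated data (assumption): 50 paired patients, one module of 20 genes,
# with every module gene shifted upwards by 0.5 at week 1.
rng = random.Random(0)
n_patients, n_genes = 50, 20
baseline = [[rng.gauss(5.0, 1.0) for _ in range(n_genes)] for _ in range(n_patients)]
week1 = [[x + 0.5 + rng.gauss(0.0, 0.3) for x in sample] for sample in baseline]

# Standardise each gene against the baseline samples only, so that both
# time points are scored on the same scale.
gene_mean = [statistics.fmean(col) for col in zip(*baseline)]
gene_sd = [statistics.stdev(col) for col in zip(*baseline)]

base_scores = module_scores(baseline, gene_mean, gene_sd)
week1_scores = module_scores(week1, gene_mean, gene_sd)
mean_diff, t_stat = paired_t(base_scores, week1_scores)

direction = "up" if mean_diff > 0 else "down"
print(f"module shift at week 1: {mean_diff:.2f} ({direction}), paired t = {t_stat:.1f}")
```

In the real analysis the module membership would come from WGCNA on the baseline samples, and the test would be run once per module and per drug.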
I have a question about strategy that I'd like some thoughts on:
Should I pool the drug A and drug B baseline samples and run an ordinary WGCNA analysis? This seems like it would increase power, as more samples would go into the network construction. Or does it make more sense to identify separate modules in the drug A and drug B groups? The separate-group approach seems cleaner, as it involves identifying modules and testing them for differential expression in exactly the same patients.
Any opinions would be appreciated.