Hi, I'm running a WGCNA analysis to detect correlated genes between a virus and its host. My issue is that, over the course of the infection, the host's share of all transcripts drops from 100% to around 20%, so when I normalize host and virus transcripts together, all finer differential dynamics are overshadowed by this trend. For the DE analysis with DESeq2, I run the whole analysis on host and virus transcripts separately, which works fine.
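To make the problem concrete, here is a toy sketch in plain Python (not my real data; the sequencing depth, the host fractions and the 1% gene share are made-up numbers, and CPM stands in for whatever normalization is actually used). It shows a host gene that keeps a constant share of the host reads, yet appears to decline when the library size includes the virus reads.

```python
# Toy numbers only: a host gene whose share of host reads never changes.
DEPTH = 1_000_000                      # total reads per sample (hypothetical)
HOST_FRACTION = [1.0, 0.8, 0.5, 0.2]   # host share of all reads over time (hypothetical)
GENE_SHARE_OF_HOST = 0.01              # gene is a constant 1% of host reads (hypothetical)


def cpm(count, library_size):
    """Counts per million relative to the chosen library size."""
    return count / library_size * 1_000_000


joint_cpm = []      # library size = host + virus reads
host_only_cpm = []  # library size = host reads only
for frac in HOST_FRACTION:
    host_reads = DEPTH * frac
    gene_reads = host_reads * GENE_SHARE_OF_HOST
    joint_cpm.append(round(cpm(gene_reads, DEPTH)))
    host_only_cpm.append(round(cpm(gene_reads, host_reads)))

print(joint_cpm)      # [10000, 8000, 5000, 2000]
print(host_only_cpm)  # [10000, 10000, 10000, 10000]
```

With joint normalization the gene seems to fall five-fold even though nothing about it changed within the host, which is the trend that overshadows everything else.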
An important property of the virus is that all of its transcripts (around 200) are strongly regulated across the time points and do not fit a scale-free topology, yet the WGCNA clustering still appears to be biologically meaningful.
Analyzing the host transcripts with the likelihood ratio test (LRT) from DESeq2 (full model: ~ time_point, reduced model: ~ 1) shows that about half of the transcripts vary significantly over time and the other half do not. I know that filtering for DE genes before WGCNA is not recommended, but could it make sense in this case, since the removed transcripts do not vary significantly and therefore should not form meaningful correlated clusters?
Now I want to check for correlations between host and virus, so I'm combining the two sets of transcripts again after applying the VST to each set separately. I'm wondering whether this is a valid approach.
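In case it helps, this is roughly what I mean by combining the two sets, as a minimal plain-Python sketch. The gene names and values are invented placeholders for the separately transformed matrices, and the `pearson` helper is only for illustration, not what WGCNA runs internally. Samples are assumed to be in the same order in both sets.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical transformed values, one per sample, same sample order in both sets.
host_vst = {
    "host_gene_A": [8.0, 7.5, 6.9, 6.1],
    "host_gene_B": [5.0, 5.6, 6.3, 7.0],
}
virus_vst = {
    "virus_gene_1": [4.0, 6.0, 8.5, 10.0],
}

# Correlate every host gene with every virus gene across the shared samples.
cross_cor = {
    (host, virus): pearson(host_values, virus_values)
    for host, host_values in host_vst.items()
    for virus, virus_values in virus_vst.items()
}

for pair, r in sorted(cross_cor.items()):
    print(pair, round(r, 2))
```

Pearson correlation does not change when either variable is shifted or rescaled by a positive constant, so a constant offset between the two separately transformed sets would not by itself alter these values. Whether the separate transformations distort the comparison in some other way is the part I'm unsure about.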
Thank you for any comments.