I would like to do differential gene expression (DGE) analysis and WGCNA using paired-end RNA-seq data. However, I am quite confused about whether I can use the same pre-processing for both DGE and WGCNA.
In summary, I have 2 experimental conditions, and each condition has 18 samples. However, they were sequenced in 3 batches: 1st batch 6 pairs (disease & control), 2nd batch 6 pairs (disease & control), 3rd batch 6 pairs (disease & control). Each batch was processed by different people at different times, so I believe we have a batch effect. As I am using DESeq2 for my analysis, I set my model design to “~ Batch + Type”. Until now I have been using removeBatchEffect() from the limma package and then plotting PCA and hierarchical clustering (based on Spearman correlation) to see which samples are outliers.
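To make the batch-removal step concrete, here is a minimal sketch of the idea for a single gene. It is not limma's implementation: it only illustrates that, when every batch contains the same number of disease and control samples (as in the design above), subtracting each batch's offset from the overall mean removes the batch shift while leaving the disease vs. control difference intact. The function name `remove_batch_means` and the toy values are hypothetical.

```python
from statistics import mean


def remove_batch_means(values, batches):
    """Shift each batch of one gene's log-expression values to the overall mean.

    values  : log-scale expression values, one per sample
    batches : batch labels in the same sample order
    Assumes a balanced design (equal disease/control counts per batch).
    """
    grand = mean(values)
    batch_mean = {
        b: mean([v for v, bb in zip(values, batches) if bb == b])
        for b in set(batches)
    }
    # Subtract only the batch offset relative to the overall mean.
    return [v - (batch_mean[b] - grand) for v, b in zip(values, batches)]


# Toy gene (hypothetical numbers): 3 batches, one control and one disease each.
batches = ["b1", "b1", "b2", "b2", "b3", "b3"]
types = ["control", "disease"] * 3
values = [5.0, 6.0, 7.0, 8.0, 3.0, 4.0]  # disease = control + 1 in every batch

adjusted = remove_batch_means(values, batches)
disease_minus_control = (
    mean([v for v, t in zip(adjusted, types) if t == "disease"])
    - mean([v for v, t in zip(adjusted, types) if t == "control"])
)
print(adjusted)
print(disease_minus_control)
```

Values adjusted this way are meant for visualisation and clustering (PCA, hierarchical clustering, WGCNA input); for the DESeq2 test itself the batch stays in the design formula and the raw counts are used.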
I asked before whether I should use rlog() for differential gene expression analysis and getVarianceStabilizedData() for WGCNA, or whether I can use the same normalization for both. The feedback I got was that it is better to use voom() and then removeBatchEffect(), as they belong to the same package (limma). Do you agree with this suggestion?
Now I am using voom() and removeBatchEffect() for the WGCNA analysis, but what do you suggest I do for DGE? Should I use the limma package or the DESeq2 package (rlog() or vst())?
In addition, I would like your advice on whether I should remove any patients from the analysis based on these hierarchical clusters:
Same data, different transformation methods (rld, vsd, voom), batch effect removed, and hierarchical clustering plotted using Spearman correlation.
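For readers who want to see what the clustering is based on, the following is a minimal sketch of a Spearman correlation distance between samples (1 minus the correlation), written from scratch. The sample names, the toy expression values, and the "lowest mean correlation" screen are hypothetical illustrations, not part of my actual pipeline or a recommended outlier rule.

```python
def ranks(xs):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


# Hypothetical transformed expression values (5 genes) for 4 samples.
samples = {
    "s1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "s2": [1.1, 2.3, 2.9, 4.2, 5.5],
    "s3": [0.9, 2.1, 3.2, 3.9, 5.1],
    "s4": [5.0, 4.0, 3.0, 2.0, 1.0],  # gene order reversed on purpose
}
names = sorted(samples)

# Distance used for clustering: 1 - Spearman correlation.
dist = {(a, b): 1 - spearman(samples[a], samples[b]) for a in names for b in names}

# Simple screen: mean correlation of each sample with all the others.
mean_rho = {
    a: sum(1 - dist[(a, b)] for b in names if b != a) / (len(names) - 1)
    for a in names
}
flagged = min(mean_rho, key=mean_rho.get)
print(mean_rho)
print(flagged)
```

A sample that sits far from the others under all three transformations is a stronger outlier candidate than one that stands out under only one of them.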
I would really appreciate any insight you can give me.
Thanks in advance.