I have been using the DESeq2 package for RNA-seq data analysis and really like the VST data in log2 units, but I am unsure whether VST data are appropriate for certain analyses.
Specifically, can the VST data be used to calculate a gene signature score (the average across all genes in a given signature), with the aim of comparing signature scores between samples with and without a given condition? I have generated batch-corrected VST data using DESeq2 and limma.
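To make the calculation concrete, here is a minimal sketch in Python of the score I have in mind: a plain per-sample mean of the VST values over the signature genes, followed by a mean per condition. The gene names, sample labels, values, and helper function names are all hypothetical and only illustrate the arithmetic; they are not from my data, and this is not meant as a statistical test.

```python
from statistics import mean

# Hypothetical batch-corrected VST values: gene -> one value per sample.
vst = {
    "GENE_A": [8.0, 8.5, 10.0, 10.5],
    "GENE_B": [6.0, 6.5, 7.0, 7.5],
    "GENE_C": [11.0, 11.0, 12.0, 12.0],
    "GENE_D": [5.0, 5.0, 5.0, 5.0],  # not part of the signature
}
# Condition label for each sample, in the same order as the value lists.
condition = ["control", "control", "treated", "treated"]
# Hypothetical signature: the genes whose values are averaged.
signature = ["GENE_A", "GENE_B", "GENE_C"]


def signature_scores(vst, signature):
    """Return one score per sample: the mean VST value over the signature genes."""
    n_samples = len(next(iter(vst.values())))
    return [mean(vst[gene][i] for gene in signature) for i in range(n_samples)]


def mean_score_by_condition(scores, condition):
    """Group the per-sample scores by condition label and average each group."""
    groups = {}
    for score, label in zip(scores, condition):
        groups.setdefault(label, []).append(score)
    return {label: mean(values) for label, values in groups.items()}


scores = signature_scores(vst, signature)
by_condition = mean_score_by_condition(scores, condition)
print(scores)
print(by_condition)
```

Each sample gets a single score, so the comparison is across samples for the same set of genes rather than between genes.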
I understand that VST data, unlike TPM, do not account for gene length, and so may not be suitable for comparing expression across genes.
Thanks for any feedback!