Hi all,
This is more of a general (philosophical?) question:
Say I have a dataset analyzed with DESeq using default settings, from which a VST-normalized dataset is obtained.
The VST data is then used to build a final binary predictive ML model that uses only a few genes from the VST dataset.
Now I have a few samples that I re-ran end-to-end with the exact same pipeline as above.
After importing them into DESeq, what normalization would you use to obtain VST values close to those from the original run?
The goal is to correctly predict the new, repeated samples.
I know there are millions of variables in play, but I was curious to see what folks would answer.
Thank you in advance.