Question

DESeq2 strategies for run-to-run replication


Giorgio • 0


Hi all,

This is more of a general (philosophical?) question.

Say I have a dataset analyzed with DESeq2 using the defaults, from which I obtain a VST-normalized dataset.
The VST data are then used to build a final binary predictive ML model that uses only a few genes from the VST dataset.
Now I have a few samples that I reran end-to-end with the exact same pipeline as above.
After importing them into DESeq2, which normalization would you use to obtain VST values close to those from the original run?
The goal is to correctly predict the new, repeated samples.

I know there are millions of variables in play, but I was curious to see what folks would answer.

Thank you in advance

DESeq2 Normalization Clustering replicate

updated 20 months ago by Michael Love 41k • written 20 months ago by Giorgio • 0

score 0 · Answer 1 · 2022-08-25

You can use the same VST on the new samples. We have some help to do this in ?varianceStabilizingTransformation:

The variance stabilizing transformation from a previous dataset can be "frozen" and reapplied to new samples. The frozen VST is accomplished by saving the dispersion function accessible with dispersionFunction, assigning this to the DESeqDataSet with the new samples, and running varianceStabilizingTransformation with 'blind' set to FALSE. Then the dispersion function from the previous dataset will be used to transform the new sample(s).
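
For anyone who wants to see the frozen VST steps spelled out, here is a minimal sketch that follows the description above. It uses simulated counts from makeExampleDESeqDataSet as stand-ins for the original run and the re-run sample, so the object names (dds, ddsNew, frozenDispFun) are purely illustrative and would be replaced by your own DESeqDataSet objects.

```r
library(DESeq2)

set.seed(1)

# Stand-in for the original run (simulated counts; use your own dds instead).
dds <- makeExampleDESeqDataSet(m = 6)
dds <- estimateSizeFactors(dds)
# Fits the dispersion trend; a dds that already went through DESeq()
# has had this step run as well.
dds <- estimateDispersions(dds)

# Save the fitted dispersion function from the original dataset.
frozenDispFun <- dispersionFunction(dds)

# Stand-in for a re-run sample processed with the same pipeline.
ddsNew <- makeExampleDESeqDataSet(m = 1)
ddsNew <- estimateSizeFactors(ddsNew)

# Assign the frozen function to the new data, then transform with
# blind = FALSE so the assigned function is reused rather than re-estimated.
dispersionFunction(ddsNew) <- frozenDispFun
vsdNew <- varianceStabilizingTransformation(ddsNew, blind = FALSE)

# Transformed values for the new sample, one row per gene.
head(assay(vsdNew))
```

In practice you would probably save the dispersion function from the original run to disk (for example with saveRDS) so that it can be reloaded whenever new samples arrive; that part is my assumption about the workflow, not something the help page prescribes.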