Hi,
I'm using DESeq2 on a fairly large data set with continuous variables. I've looked through the vignette and various questions within the forum to see how best to handle outliers when using continuous variables in the model. The vignette states the following:
> Note that with continuous variables in the design, outlier detection and replacement is not automatically performed, as our current methods involve a robust estimation of within-group variance which does not extend easily to continuous covariates. However, users can examine the Cook’s distances in `assays(dds)[["cooks"]]`, in order to perform manual visualization and filtering if necessary.
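For concreteness, here is a minimal sketch of the manual inspection the vignette describes. It runs on simulated data from `makeExampleDESeqDataSet`, not on the real experiment. The covariate name `age` is hypothetical, and the threshold is an assumption: it borrows the 0.99 quantile of the F(p, m - p) distribution that `results()` documents as its default `cooksCutoff`, with p coefficients and m samples.

```r
library(DESeq2)

set.seed(1)

# Simulated counts standing in for the real data (assumption for illustration)
dds <- makeExampleDESeqDataSet(n = 200, m = 12)

# Hypothetical continuous covariate, already centred and scaled
dds$age <- rnorm(ncol(dds))
design(dds) <- ~ age

dds <- DESeq(dds)

# Cook's distances: one row per gene, one column per sample
cooks <- assays(dds)[["cooks"]]

# Largest Cook's distance per gene (NA if the whole row is NA)
maxCooks <- apply(cooks, 1, function(x) {
  if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
})
names(maxCooks) <- rownames(dds)

# Assumed threshold: 0.99 quantile of F(p, m - p)
p <- ncol(model.matrix(design(dds), data = as.data.frame(colData(dds))))
m <- ncol(dds)
cutoff <- qf(0.99, p, m - p)

# Genes with at least one sample above the threshold
flagged <- names(which(maxCooks > cutoff))

# Inspect only the flagged genes in the results table
res <- results(dds)
res[flagged, ]
```

This only flags genes for closer inspection; it does not replace or remove any counts.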
Rather than immediately throwing out every gene whose Cook's distance exceeds some threshold, and to avoid plotting thousands of genes individually, I'd like to apply the `replaceOutliers` function to the genes above that threshold and then see which genes are still significant.
Is there an easy way to apply `replaceOutliers` based on a Cook's distance cutoff? Or is there an inherent problem with that approach?
Just for kicks, I tried running `DESeq` followed by `replaceOutliers` and didn't see any differences. I then tried re-running `DESeq` after `replaceOutliers` to see if the replacement would stick, but that didn't seem to make a difference either.
Really appreciate your help.