Hello,
Can anyone here recommend any post hoc normalisation steps following DESeq2 before generating a Partial Least Squares Discriminant Analysis (PLS-DA) and VIP (Variable Importance in Projection) plots?
I presume it's not appropriate to just run the code below and use the resulting counts to generate a PLS-DA?
Generates normalised counts from the DESeqDataSet and writes them to CSV:
dds <- estimateSizeFactors(dds)
Normalisedcounts <- counts(dds, normalized=TRUE)
write.csv(Normalisedcounts, file="27_Norm.csv")
Apologies if this seems like an uneducated question, but I'm new to both 'big data' statistics and DESeq2.
Thanks for your feedback on this. I've now read up on rlog and vst in the vignette. Can I confirm that the VST is a transformation applied after normalisation by size factors? That is my interpretation of this passage from your vignette:
"Above, we used a parametric fit for the dispersion. In this case, the closed-form expression for the variance stabilizing transformation is used by the vst function. If a local fit is used (option fitType="locfit" to estimateDispersions) a numerical integration is used instead. The transformed data should be approximated variance stabilized and also includes correction for size factors or normalization factors. The transformed data is on the log2 scale for large counts."
If so, would this code be correct for generating a VST and size-factor-normalised output from your package?
Code for generating a VST and outputting to CSV
Yes but to be clear I would do:
vsd in our documentation denotes a variance stabilized dataset, which means it has attached phenotypic data. I use mat when it's a simple matrix.
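The vsd versus mat distinction could be sketched roughly as follows. This is an illustration, not the code from the reply above: the simulated dds from makeExampleDESeqDataSet stands in for a real dataset, and the output file name is hypothetical.

```r
library(DESeq2)

# Simulated counts standing in for a real DESeqDataSet (assumption for illustration)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 5000, m = 6)

# vst() estimates size factors if they are missing, then applies the
# variance stabilizing transformation; the result is a DESeqTransform
# object that still carries the sample (phenotypic) information
vsd <- vst(dds, blind = TRUE)
colData(vsd)

# assay() pulls out the plain numeric matrix (genes x samples)
mat <- assay(vsd)

# Hypothetical output path; replace with your own file name
out_file <- file.path(tempdir(), "vst_counts.csv")
write.csv(mat, file = out_file)
```

Here vsd keeps colData alongside the transformed values, while mat is just the matrix, which is the form most downstream tools expect as input.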