
Question: Recommendations for normalisation post-deseq2
9 days ago by
marc.osullivan0 wrote:

Hello,

Can anyone here recommend any post hoc normalisation steps following DESeq2 before generating a Partial Least Squares Discriminant Analysis (PLS-DA) and VIP (Variable Importance in Projection) plots?

I presume it's not appropriate to just run the code below and use the resulting counts to generate a PLS-DA?

# Generates DeSeqDataSet Normalisedcounts and prints as CSV

dds <- estimateSizeFactors(dds)
Normalisedcounts <- counts(dds, normalized=TRUE)
write.csv(Normalisedcounts, file="27_Norm.csv")

Apologies if this seems like an uneducated question, but I'm new to both 'big data' statistics and DESeq2.

9 days ago by
Michael Love (United States) wrote:

I don't know what these plots are.

Take a look at the vignette and workflow for DESeq2. We have counts(dds, normalized=TRUE) which just scales the counts for library size.
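To make "scales the counts for library size" concrete, here is a minimal base-R sketch of the median-of-ratios idea behind size factors, applied to a made-up toy matrix. The function name `median_of_ratios` and the data are assumptions for illustration only; in practice `estimateSizeFactors()` and `counts(dds, normalized=TRUE)` do this for you.

```r
# Toy count matrix: 4 genes (rows) x 3 samples (columns); values are made up.
# sample2 has twice, and sample3 four times, the depth of sample1.
counts <- matrix(
  c( 10,  20,  40,
    100, 200, 400,
      5,  10,  20,
     50, 100, 200),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Hypothetical helper: for each sample, take the median ratio of its counts
# to the per-gene geometric mean across all samples.
median_of_ratios <- function(counts) {
  log_geo_means <- rowMeans(log(counts))
  apply(counts, 2, function(k) {
    keep <- is.finite(log_geo_means) & k > 0   # skip genes with a zero count
    exp(median((log(k) - log_geo_means)[keep]))
  })
}

size_factors <- median_of_ratios(counts)   # 0.5, 1, 2 for this toy matrix

# Dividing each column by its size factor gives the normalized counts.
norm_counts <- sweep(counts, 2, size_factors, "/")
```

After this scaling the three toy samples have identical columns, but the values are still on the raw count scale: nothing has been log-transformed or variance-stabilized, which is why the transformations mentioned next are usually preferred for distance-based or multivariate methods.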

There are also transformations which are good for computing distances between samples and plotting. There is a lot of information on these in the vignette and workflow (and the publication), so check there first and then come back here with any remaining questions.

Thanks for your feedback on this. I've now read up on rlog and vsd in the vignette. Can I confirm that the VST is a transformation applied after normalisation by size factors? This is my interpretation of the following passage from your vignette:

"Above, we used a parametric fit for the dispersion. In this case, the closed-form expression for the variance stabilizing transformation is used by the vst function. If a local fit is used (option fitType="locfit" to estimateDispersions) a numerical integration is used instead. The transformed data should be approximated variance stabilized and also includes correction for size factors or normalization factors. The transformed data is on the log2 scale for large counts."

If so, would this code be correct for generating a VST-transformed, size-factor-normalised output from your package?

# Code for generating a VST and outputting to CSV

vsd <- assay(varianceStabilizingTransformation(dds, blind=FALSE))
write.csv(vsd, file="27_Norm2.csv")


Yes, but to be clear I would do:

vsd <- vst(dds, blind=FALSE)
mat <- assay(vsd)
write.csv(mat, file="file.csv")


vsd in our documentation denotes a variance stabilized dataset, which means it has attached phenotypic data. I use mat when it's a simple matrix.
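For readers who want to carry the transformed matrix into a PLS-DA, the sketch below shows one way to prepare the inputs. It assumes DESeq2 is installed, uses simulated data from `makeExampleDESeqDataSet()` in place of a real `dds`, and assumes the grouping variable is a `condition` column in the sample data, as it is in the simulated dataset. The names `x` and `y` are illustrative, and no particular PLS-DA package is implied.

```r
library(DESeq2)

set.seed(1)
# Simulated dataset standing in for a real DESeqDataSet (assumption).
dds <- makeExampleDESeqDataSet(n = 5000, m = 6)

# Variance stabilizing transformation; the result keeps the sample data.
vsd <- vst(dds, blind = FALSE)

# Plain numeric matrix: genes in rows, samples in columns.
mat <- assay(vsd)

# Many multivariate methods expect samples in rows, so transpose.
x <- t(mat)

# Group labels, in the same order as the rows of x.
y <- colData(vsd)$condition
```

Keeping `vsd` around, rather than only the matrix, means the sample labels in `colData(vsd)` stay aligned with the columns of `mat`.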