Question: DESeq2 - VST eliminates zero values
cafosspot wrote, 28 days ago:

Greetings,

I have been following the DESeq2 vignette to analyze a large number of RNA-seq samples. The size factors are comparable across samples. While investigating the various data transformations for visualization purposes, I found the VST (with blind=FALSE) more effective than the log-transformed normalized counts (with a pseudocount of 1) at stabilizing the variance across the range of means. rlog seems to just keep running, which I assume is due to the large number of samples. Since I plan on incorporating even more samples, I was going to stick with the VST.

However, I noticed that, with these particular data, the transformation appears to eliminate all the zero values from my count matrix. Comparison with the normalized counts suggests that these zero values are simply being scaled up (the transformed value is identical in each case, ~3.5). Samples with higher counts are also scaled up, as would be expected, and biologically the results appear consistent between the log-transformed normalized counts and the VST counts.

Heatmaps of specific genes of interest look highly similar between the two, inter-sample distances seem to make sense for both, and PCA shows that the samples group in a nearly identical and meaningful way, regardless of the input used.

I found a previous post with a similar issue, though it never received a definitive answer on whether this is acceptable. I've gone through the vignette and the DESeq2 paper to look for insight, but I'm still not sure I fully understand what's happening here.

Thanks for your time.

Answer: DESeq2 - VST eliminates zero values
Michael Love wrote, 28 days ago:

What do you mean by "scaled up"?

Take a look at the third plot in this section (three panels of scatterplots):

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#the-variance-stabilizing-transformation-and-the-rlog

This is the expected behavior of the VST: zeros are mapped to a fixed finite value instead of to -Infinity, which is where a plain log transform would send them.
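
To make the behavior concrete, here is a minimal numerical sketch in Python. It is not DESeq2's code: it assumes negative binomial counts with a single fixed dispersion (the value `ALPHA` is hypothetical), for which a variance-stabilizing transform has a simple closed form. DESeq2's `vst()` works from a dispersion trend fitted to the data, so the actual numbers will differ. The function names are made up for illustration.

```python
import math


def vst_fixed_dispersion(q, alpha):
    """Variance-stabilizing transform for negative binomial counts,
    assuming one fixed dispersion `alpha` (an assumption of this sketch).
    `q` is a normalized count, q >= 0. Result is on a log2-like scale."""
    return (2 * math.asinh(math.sqrt(alpha * q))
            - math.log(alpha) - math.log(4)) / math.log(2)


def log2_pseudocount(q):
    """Plain log2 of the normalized count plus a pseudocount of 1."""
    return math.log2(q + 1)


ALPHA = 0.1  # hypothetical dispersion, chosen only for illustration

# Compare the two transforms from zero up to large counts.
for q in (0, 1, 10, 100, 10_000, 1_000_000):
    print(f"{q:>9}  log2(q+1) = {log2_pseudocount(q):7.3f}  "
          f"vst = {vst_fixed_dispersion(q, ALPHA):7.3f}")
```

In this sketch a count of zero maps to `-log2(4 * ALPHA)`, a finite constant that depends only on the dispersion, so every zero ends up at the same value. For large counts the transform approaches `log2(q)`, which is why high-count genes look much the same under both transformations.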


Poor choice of words; I meant that samples with higher counts are still higher after the transformation. I was not correctly visualizing what the VST does to the lower counts. This helps a lot. Thanks very much, Michael.

Reply written 28 days ago by cafosspot