DESeq2: What is the unit of DESeq2 normalized read count (VST)? Is it tag per million?

Hi, I am using the DESeq2 (DESeq2_1.22.2) VST algorithm to normalize the tag counts within peaks from CAGE-seq data. I want to use the VST-transformed counts in peaks to see how peak activity changes across cell lines and to determine cell line-specific peaks. I want to "normalize counts" across samples for cross-sample comparison of peak activity, and I want "normalized counts per million" so that I can define cell line-specific peaks as those with >1 TPM.

I thought the VST transformed read count was the right way to go because the VST considers the size factor/dispersion to normalize the count and the unit of VST transformed read count is "count-per-million" (according to the post by Ryan C. Thompson ub

However, when I summed the VST-normalized peak counts for each cell line, the sums were in the range of 10-20 million, which is 10-20 times larger than I expected.

Here are my questions. 1) Is the unit of the VST-normalized peak count "counts per million"? If so, what are possible explanations for my 10-20 million VST-transformed read counts per cell line? 2) What pseudocount is used in VST? I couldn't find a pseudocount for VST in the DESeq2 documentation. Is there no pseudocount for VST?

Best regards, Ju Heon Maeng


VST is approximately log2 of scaled counts (as the counts become larger it converges to this).

So it's not CPM or anything like that. The values are on the scale of counts that have been scaled to the middle of the range of sequencing depths in your dataset. In other words, each value is roughly the log2 of the count you would have seen if that sample had been sequenced at a mid-range depth.
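To make the distinction concrete, here is a minimal Python sketch on made-up numbers. It is not DESeq2's code and it does not reproduce the actual VST: it uses median-of-ratios size factors (as described in the DESeq paper) and plain log2 of the scaled counts, which is only the large-count approximation mentioned above. The toy matrix and all function names are hypothetical, and the toy data contains no zeros, since plain log2 is undefined there.

```python
import math
from statistics import median

# Hypothetical toy data: raw tag counts for 4 peaks (rows) in 3 samples (columns).
# Sample 2 is sequenced twice as deeply as sample 1, sample 3 four times as deeply.
counts = [
    [100, 200, 400],
    [50, 100, 200],
    [1000, 2000, 4000],
    [10, 20, 40],
]

def size_factors(counts):
    """Simplified median-of-ratios size factors; rows containing a zero are skipped."""
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if min(row) <= 0:
            continue
        # Log of the geometric mean of this peak across samples.
        log_geo_mean = sum(math.log(k) for k in row) / len(row)
        for j, k in enumerate(row):
            log_ratios[j].append(math.log(k) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]

def scaled_counts(counts, sf):
    """Divide each sample's counts by that sample's size factor."""
    return [[k / s for k, s in zip(row, sf)] for row in counts]

def cpm(counts):
    """Counts per million, relative to each sample's own library size."""
    lib_sizes = [sum(col) for col in zip(*counts)]
    return [[k / n * 1e6 for k, n in zip(row, lib_sizes)] for row in counts]

sf = size_factors(counts)
norm = scaled_counts(counts, sf)
log2_norm = [[math.log2(v) for v in row] for row in norm]  # stand-in for VST at large counts
cpm_vals = cpm(counts)

for j in range(len(sf)):
    print(
        f"sample {j + 1}: size factor = {sf[j]:.2f}, "
        f"sum(scaled) = {sum(row[j] for row in norm):.0f}, "
        f"sum(log2 scaled) = {sum(row[j] for row in log2_norm):.2f}, "
        f"sum(CPM) = {sum(row[j] for row in cpm_vals):.0f}"
    )
```

In this toy example only the CPM column sums to one million per sample. The scaled counts sum to a value set by the "typical" depth of the dataset, and the log2 values are not counts at all, so summing them has no per-million interpretation. If a ">1 per million" cutoff is needed, it would have to be applied to a per-million quantity computed separately from the raw counts (DESeq2 offers `fpm()` for this) rather than to the VST output.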

There is no pseudocount used in VST. See the DESeq (2010) paper for description of the transformation.
