Question: Data normalization and transformation using DESeq2
chiara.facciotto0 wrote:

Hi,

I am trying to use DESeq2 to normalize my RNA-Seq data. After estimating the size factors, can I run the rlog or vst transformation directly, or do I also need to call the counts() function in between? In other words, which of the two options below is correct? The only difference between the two scripts is the extra counts() call in Option 2.

Thank you very much for your help!!!

Option 1:

# Import data
dds <- DESeqDataSetFromMatrix(countData = counts, colData = colData, design = ~ 1)

# Pre-filtering
dds <- dds[ rowSums(counts(dds)) > 0, ]

# Estimate factor for normalization
dds <- estimateSizeFactors(dds)

# Compute log2 counts
rld <- rlog(dds, blind=FALSE)
table.out <- assay(rld)

Option 2:

# Import data
dds <- DESeqDataSetFromMatrix(countData = counts, colData = colData, design = ~ 1)

# Pre-filtering
dds <- dds[ rowSums(counts(dds)) > 0, ]

# Estimate factor for normalization
dds <- estimateSizeFactors(dds)
dds <- counts(dds, normalized=TRUE)

# Compute log2 counts
rld <- rlog(dds, blind=FALSE)
table.out <- assay(rld)
Michael Love wrote:

The counts() function doesn't normalize the object; it just returns a matrix (with normalized=TRUE, the counts divided by the size factors). So you don't want to do this: dds <- counts(dds, ...), because you've just replaced a DESeqDataSet (which has a lot of information in it) with a count matrix (which has less information). You've discarded the sample and gene metadata that the object carried.
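
To illustrate the point above, here is a minimal sketch that keeps dds as a DESeqDataSet and stores the normalized counts under a separate name. The simulated data from makeExampleDESeqDataSet() is only a stand-in for the poster's counts and colData, and norm.counts and manual are illustrative names, not part of the original scripts.

```r
library(DESeq2)

# Simulated data as a stand-in for the poster's `counts` and `colData`
# (an assumption made purely for illustration).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 6)

# Pre-filtering and size factor estimation, as in the original scripts
dds <- dds[rowSums(counts(dds)) > 0, ]
dds <- estimateSizeFactors(dds)

# Keep dds intact; put the normalized counts in a separate variable
norm.counts <- counts(dds, normalized = TRUE)

print(class(dds))          # class of the object we keep for rlog/vst
print(class(norm.counts))  # class of what counts() returned

# The normalized counts are the raw counts divided, column by column,
# by the size factors
manual <- sweep(counts(dds), 2, sizeFactors(dds), "/")
print(all.equal(norm.counts, manual, check.attributes = FALSE))
```

If a normalized count table is needed (for export, say), norm.counts can be used for that, while dds is still what gets passed to rlog() or vst().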

chiara.facciotto0 wrote:

Thanks! And is it OK to run estimateSizeFactors(dds) before running rlog(dds, blind=FALSE)?

Yes, that's fine. But if you haven't done that already, the rlog function will estimate the size factors internally.
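
As a rough check of the answer above, the sketch below runs rlog() once on an object without size factors and once after an explicit estimateSizeFactors() call, then compares the two results. Again, the simulated data set is an assumption standing in for the poster's data, and dds.raw, rld.auto and rld.manual are illustrative names.

```r
library(DESeq2)

# Simulated stand-in for the poster's data (assumption for illustration)
set.seed(1)
dds.raw <- makeExampleDESeqDataSet(n = 500, m = 6)
dds.raw <- dds.raw[rowSums(counts(dds.raw)) > 0, ]

# No size factors have been estimated on this object yet
print(is.null(sizeFactors(dds.raw)))

# Path A: let rlog() estimate the size factors itself
rld.auto <- rlog(dds.raw, blind = FALSE)

# Path B: estimate the size factors explicitly first, as in Option 1
dds.sf <- estimateSizeFactors(dds.raw)
rld.manual <- rlog(dds.sf, blind = FALSE)

# Compare the transformed tables from the two paths
print(all.equal(assay(rld.auto), assay(rld.manual)))
```

Calling estimateSizeFactors() explicitly is still useful when you want to inspect the size factors before transforming the data.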