DESeq2 normalization and statistical method
SH • 0
South Korea

Hi, I have two questions.

  1. Raw read counts are used as the input data for DESeq(). Is that really okay? I found that information on the following website. However, I would like to try normalizing the raw counts with RLE (Relative Log Expression).

  2. What is the default statistical method of DESeq()? Is it nbinomWaldTest?

Thank you for your help.

# setup
library(DESeq2)
# Counts: raw count matrix (genes x samples) with six sample columns
condition <- factor(c("c", "c", "c", "t", "t", "t"))
coldata <- data.frame(row.names = colnames(Counts), condition)
dds <- DESeqDataSetFromMatrix(countData = Counts, colData = coldata, design = ~ condition)

# Estimate size factors using Relative Log Expression method
# Question: I thought this "ratio" type is RLE normalization. Is that right?
dds <- estimateSizeFactors(dds, type = "ratio")
dds <- estimateDispersions(dds)
# Question: is this the same as just DESeq(dds)?
dds <- nbinomWaldTest(dds)

# Retrieve normalized counts
# Question: how can I apply this to dds? dds$countData <- normcounts?
normcounts <- counts(dds, normalized = TRUE)

dds <- DESeq(dds)
res <- results(dds, contrast = c("condition", "t", "c"))
normalization deseq2 StatisticalMethod • 333 views
ATpoint ★ 2.9k

This is all answered in the vignette that you reference yourself: what DESeq() does, how to get normalized counts, and that raw counts are the expected input. I also answered your other question before (see "deseq2 RLE normalization"): the default normalization is what people nowadays call RLE.

You even reference the vignette, but if you don't trust it (written by the developer) then I don't see what anyone could say to help you.
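
For intuition, the median-of-ratios idea behind what is nowadays called RLE can be sketched in a few lines of base R. This is a hand-written illustration on made-up counts, not DESeq2's own code; the function name `median_of_ratios` and the toy matrix are hypothetical. It also shows why the normalized counts are only a derived view: the raw counts stay untouched, and the size factors are what the model uses.

```r
# Toy raw count matrix: 4 genes x 3 samples (made-up numbers for illustration).
# Sample s2 was sequenced 2x as deeply as s1, and s3 4x as deeply.
counts <- matrix(
  c( 10,  20,  40,
    100, 200, 400,
     50, 100, 200,
      5,  10,  20),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:3))
)

# Median-of-ratios size factors, written out by hand (hypothetical helper).
median_of_ratios <- function(counts) {
  # Log geometric mean of each gene across samples; -Inf if any count is zero.
  log_geo_means <- rowMeans(log(counts))
  usable <- is.finite(log_geo_means)
  # Per sample: median over usable genes of log(count / geometric mean).
  apply(counts, 2, function(sample_counts) {
    exp(median((log(sample_counts) - log_geo_means)[usable]))
  })
}

size_factors <- median_of_ratios(counts)

# Normalized counts: divide each sample (column) by its size factor.
norm_counts <- sweep(counts, 2, size_factors, "/")

print(size_factors)
print(norm_counts)
```

On this toy matrix the size factors come out as 0.5, 1 and 2, so every gene has the same normalized count in all three samples. With real data, `sizeFactors(dds)` and `counts(dds, normalized = TRUE)` give the corresponding values from DESeq2 itself.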


Thank you for answering my repeated question. Although the documentation states that normalization is not necessary, I was confused because some other parts of the documentation suggested normalization methods.

In fact, I outsourced the RNA sequencing analysis. However, there was a difference between the results analyzed by the company and the results analyzed by me, so I inquired about it. The company does not provide specific information on the grounds of security. The functions used by the company are DESeqDataSetFromMatrix and estimateSizeFactors for RLE normalization, followed by estimateDispersions and nbinomWaldTest.

The differences between my results and the company's results are as follows:

  1. Some genes are missing.
  2. For genes that are not missing, the p-value is the same.
  3. The company reports many genes with |logFC| > 1.5, but my analysis finds only a few.

I think the difference is due to normalization. When using the above functions, I did not use any special options. However, I do not know what options the company used. Have you had a similar experience?


Either the company provides you with the full code and software versions, or (imo) their analysis is basically useless because it is not reproducible.
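
As a starting point for that comparison, here is a minimal sketch that runs the wrapper and the step-by-step route side by side and records the software versions. It uses simulated data from `makeExampleDESeqDataSet()` as a stand-in for the real count matrix, and the object names (`dds_a`, `dds_b`, and so on) are made up for illustration. What the company actually ran is unknown, so the checks at the end are only guesses at where a difference could come from, for example genes left without an adjusted p-value, genes filtered out before the analysis, or shrunken versus unshrunken fold changes.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real data: 1000 genes, 6 samples in two groups.
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Route A: the wrapper.
dds_a <- DESeq(dds)

# Route B: the individual steps named in the thread.
dds_b <- estimateSizeFactors(dds)
dds_b <- estimateDispersions(dds_b)
dds_b <- nbinomWaldTest(dds_b)

res_a <- results(dds_a)
res_b <- results(dds_b)

# With default settings both routes are expected to agree on this data.
same_pvalue <- all.equal(res_a$pvalue, res_b$pvalue)
same_lfc <- all.equal(res_a$log2FoldChange, res_b$log2FoldChange)
print(same_pvalue)
print(same_lfc)

# Genes that have no adjusted p-value; one possible reason for "missing" genes
# if a results table was exported after dropping NA rows.
print(sum(is.na(res_a$padj)))

# Record exactly which versions produced these results.
print(packageVersion("DESeq2"))
print(sessionInfo())
```

If the two routes agree on your own data as well, the remaining differences are more likely to come from options, filtering, or post-processing than from the normalization itself, which is exactly why the company's code and versions are needed.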

