DESeq2 output explanation

Hello!

I have a question about the results of differential expression analysis in DESeq2.

I did:

DEgenes=results(DEcd, contrast=c("clD", "V", "VG"))

and I got a table with the log2 fold changes and the p-values. What I don't understand is which group the fold change refers to. If I have a negative log2 fold change, it means that the gene is down-regulated, but in which of the two groups of samples? Where can I find this information?

Many thanks

Francesca

@mikelove

We describe the interpretation of results in a few places which you might find useful. Check the section "More information on results columns" in the software vignette:

vignette("DESeq2")

and also the "Building the results table" section of the workflow (this has a slower pace than the vignette and might be helpful to look over):

http://www.bioconductor.org/help/workflows/rnaseqGene/


Hi Michael,

thanks for your reply. I read both the vignette and the tutorial, but I still didn't find the information I'm looking for.

If I extract the results for a specific contrast, let's say A vs B, how can I know whether the log2 fold changes refer to A or B?


A positive log2 fold change for a comparison of A vs B means that gene expression in A is larger in comparison to B.

Here's the section of the vignette

"For a particular gene, a log2 fold change of −1 for condition treated vs untreated means that the treatment induces a change in observed expression level of 2^−1 = 0.5 compared to the untreated condition."

Here's the section of the workflow

"The column log2FoldChange is the effect size estimate. It tells us how much the gene's expression seems to have changed due to treatment with dexamethasone in comparison to untreated samples. This value is reported on a logarithmic scale to base 2: for example, a log2 fold change of 1.5 means that the gene's expression is increased by a multiplicative factor of 2^1.5≈2.82."
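The arithmetic behind the two quoted passages can be sketched in base R. The group means below are made-up numbers for illustration only, and the plain ratio of means is a simplification: DESeq2 estimates the log2 fold change from its fitted model, so its reported values will generally not equal this naive ratio exactly. The sign convention, however, is the same: the first group named in "A vs B" is the numerator.

```r
# Hypothetical normalized mean expression of one gene in each group
# (illustrative numbers, not taken from any real dataset)
mean_treated   <- 50
mean_untreated <- 100

# "treated vs untreated" puts treated in the numerator
lfc <- log2(mean_treated / mean_untreated)
lfc       # -1: expression is lower in treated than in untreated
2^lfc     # 0.5: treated is at half the untreated level

# Reversing the comparison only flips the sign
lfc_reversed <- log2(mean_untreated / mean_treated)
lfc_reversed   # 1

# Converting an LFC of 1.5 back to a multiplicative factor
fold <- 2^1.5
fold
```

So a negative value for "A vs B" means lower expression in A relative to B, and the same gene would show the positive value of equal magnitude for "B vs A".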


Hi Michael,

I have a further question. I read that I can use the rlog transformation and use those values for heatmaps or PCA. Do I need to use raw counts as input for rlog, or do I need to normalize for library size first (dividing the reads for each gene by the total reads of the sample)?

thanks!


Always check the documentation first, by typing the function name with a question mark in front: ?rlog

The help file tells you:

"This function transforms the count data to the log2 scale in a way which minimizes differences between samples for rows with small counts, and which normalizes with respect to library size."

The vignette (accessible via vignette("DESeq2")) section on transformations says:

"Both transformations produce transformed data on the log2 scale which has been normalized with respect to library size."

So the rlog function takes care of normalization for library size; you do not provide the rlog with normalized counts or non-integer values.
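As a minimal sketch of that workflow, the example below uses the simulated dataset from `makeExampleDESeqDataSet()` as a stand-in for real data (an assumption for illustration; the object names `dds`, `rld` and `rlog_mat` are arbitrary). The point it shows is that the raw integer counts go straight into `rlog()`, with no manual division by library size beforehand.

```r
library(DESeq2)

# Simulated dataset shipped with DESeq2, used here only as a stand-in
# for a real DESeqDataSet built from raw counts
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# The input is the raw count matrix; do NOT scale by library size first
head(counts(dds))

# rlog() handles library-size normalization internally
rld <- rlog(dds, blind = TRUE)

# Transformed values on the log2 scale, e.g. for a heatmap
rlog_mat <- assay(rld)
dim(rlog_mat)

# PCA of the samples, colored by the "condition" column of colData
plotPCA(rld, intgroup = "condition")
```

Whether `blind = TRUE` or `blind = FALSE` is appropriate depends on the purpose of the plot; the vignette section on transformations discusses that choice.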


In plain English: I have a comparison of HSC and LSC, where HSC is my control. I'm comparing HSC vs LSC, so if the fold change is positive, does it mean the gene is higher in HSC?


If HSC is the control, nearly all (perhaps all) R/Bioc packages and analysts would expect you to set HSC as the reference level and report LSC vs HSC (read: log(LSC / HSC)) as the LFC. This is also printed at the top of the results table when you print it to the console, if you follow the guidelines in the vignette on setting factor levels.

Also see the workflow (rnaseqGene package) which explains how to interpret the sign of the LFC.
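How the reference level is determined can be sketched with plain R factors. The sample labels below are hypothetical; in a real analysis the factor would be a column of the sample table (`colData`). By default R orders factor levels alphabetically and the first level acts as the reference, so it is safer to set it explicitly with `relevel()` than to rely on the names happening to sort the right way.

```r
# Hypothetical sample labels for illustration
condition <- factor(c("LSC", "HSC", "LSC", "HSC"))

# Levels are alphabetical by default; the first one is the reference
levels(condition)   # "HSC" "LSC"

# Setting the reference explicitly makes the intent clear
condition <- relevel(condition, ref = "HSC")
levels(condition)   # "HSC" "LSC"

# A case where the alphabetical default is NOT what you want
group <- factor(c("untreated", "treated", "untreated", "treated"))
levels(group)       # "treated" "untreated": treated would be the reference

group <- relevel(group, ref = "untreated")
levels(group)       # "untreated" "treated"
```

With HSC as the reference, the default comparison is LSC vs HSC, so a positive LFC means higher expression in LSC.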


"would expect you to set HSC as the reference level": yes, I had done this. "report LSC vs HSC (read: log(LSC / HSC)) as the LFC": thank you for clarifying this in simple words. It was a bit confusing, although I might have run your library more than 100 times.


Hi Michael,

I have a doubt about the explanation "A positive log2 fold change for a comparison of A vs B means that gene expression in A is larger in comparison to B."

Does it have anything to do with the alphabetical order of the conditions? For example, I have "Response" and "Non-response" in my sample data. How do I know whether a positive fold change means an increase in the Response group or in the Non-response group?

Note: I am using DESeq2 with phyloseq, http://joey711.github.io/phyloseq-extensions/DESeq2.html

Thanks,

Reeba


See this section of the vignette:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#note-on-factor-levels

There are three ways to know:

You can specify the reference level as in the above link.

You can specify the contrast explicitly when you call results() by using the 'contrast' argument.

Finally, when you print the DESeqResults table, it has the information printed at the top, see here:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#differential-expression-analysis

...
log2 fold change (MLE): condition treated vs untreated
...
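The three options above can be sketched together as follows. This is an illustration on the simulated dataset from `makeExampleDESeqDataSet()`, whose `condition` factor has the levels "A" and "B"; those names and the object names are stand-ins for your own design, not part of the original question.

```r
library(DESeq2)

# Simulated data; condition has levels "A" and "B"
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# 1. Set the reference level explicitly before running DESeq()
dds$condition <- relevel(dds$condition, ref = "A")
dds <- DESeq(dds)

# 2. Name the comparison explicitly:
#    c(variable, numerator level, denominator level)
res <- results(dds, contrast = c("condition", "B", "A"))

# 3. Read the comparison from the results object itself
mcols(res)$description   # the log2FoldChange entry should name "B vs A"
res                      # printing shows the same header line at the top

# Reversing the contrast should only flip the sign of the LFC
res_rev <- results(dds, contrast = c("condition", "A", "B"))
all.equal(res$log2FoldChange, -res_rev$log2FoldChange)
```

Using the `contrast` argument has the advantage that the direction of the comparison is written in the code itself and does not depend on how the factor levels happen to be ordered.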


Thanks Michael. That worked.


Hi Michael,

Thanks for your clear explanation. But would it not make more sense to be able to see the normalized gene expression values from both conditions alongside the corresponding log fold change?

Maybe I missed it in the manual, but can you point me to how to get the gene expression values for each sample, including each replicate?
