Question

How to explain how DESeq2 works to someone with zero bioinformatics background?


wanziyi89 • 0


Hi everyone,

Just a very odd question. I have finished a differential gene expression analysis using the DESeq2 package. I got my list of DEGs, fold changes, volcano plots, etc.

Now I have a major headache: how to present how DESeq2 works to my PI, who has limited knowledge of bioinformatics. I don't want to bluff my way through the results presentation, but I want to ~~educate~~ share the knowledge with my PIs and also several of my colleagues who also have zero bioinformatics background.

Does anyone have prior experience with this kind of situation? Would you mind sharing some tips?

regards,

Kenta.

updated 7.2 years ago by Michael Love • written 7.2 years ago by wanziyi89

score 10 · Accepted Answer · 2017-05-08

Hi Kenta,

This is an answer I posted to Biostars a while ago:

We really tried to write the main text of the paper so that it could be understood by non-statisticians. That said, I'll try to do it in a few sentences. Before asking further questions on details, please at least *try* to read the paper :)

Let's say we want to compare counts between two groups. We build a model for the observed counts. This model has some parameters: (1) a normalization parameter, which accounts at least for differences in library size and can be extended by other software; (2) a variance parameter, called dispersion; (3) parameters representing the group differences.

Fit (1) using the same method as the original DESeq. Fit (2) in two steps: first, for each gene, find the value of the parameter that makes the likelihood largest, which is called maximum likelihood estimation; second, look at the values from all of the genes and move these values towards a middle value. Bayes' theorem guides the amount of movement for each gene: if the information for the gene is low, the value is moved more towards the middle; if the information for the gene is high, the value is moved very little. Fit (3) using the same technique as used for (2).

The values for (3) are a useful final product, as are sets of genes where the group differences are likely to be above a threshold (zero or otherwise). These sets are defined by their false discovery rate.

https://www.biostars.org/p/127756/#127941
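For a presentation to a non-bioinformatics audience, a tiny numeric illustration of steps (1) and (2) may help. The sketch below is not DESeq2 code (DESeq2 is an R package) and does not reproduce its calculations. `size_factors` follows the median-of-ratios idea from the original DESeq paper. `shrink` is a hypothetical toy that only mimics "move the value towards a middle value, more so when information is low"; its name, its `information` argument, and `prior_strength` are assumptions made up for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts[g][s] is the raw count for gene g in sample s.
    """
    n_samples = len(counts[0])
    # Keep only genes with no zero counts so every logarithm is finite.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the geometric mean across samples, one value per gene.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for s in range(n_samples):
        # Compare each gene in this sample with its across-sample reference.
        log_ratios = [math.log(row[s]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def shrink(estimate, middle, information, prior_strength=1.0):
    """Toy shrinkage (illustrative only, not the DESeq2 formula).

    Returns a weighted average of the per-gene estimate and the shared
    middle value; more information means more weight on the gene's own value.
    """
    weight = information / (information + prior_strength)
    return weight * estimate + (1.0 - weight) * middle


# Sample 2 was sequenced twice as deeply as sample 1 (made-up numbers).
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 3],  # contains a zero, so it is skipped when computing size factors
]
factors = size_factors(counts)
print("size factors:", factors)

# Same raw estimate, different amounts of information (made-up numbers).
low_info = shrink(estimate=2.0, middle=0.0, information=0.1)
high_info = shrink(estimate=2.0, middle=0.0, information=9.0)
print("low information:", low_info)
print("high information:", high_info)
```

In this toy data the second size factor comes out twice the first, which matches the doubled sequencing depth, and the low-information estimate lands much closer to the middle value than the high-information one.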

Let me know if this helps or if you have further questions (you can reply with a comment here on the Bioconductor support site).