Question

Likelihood Ratio Test


csijst • 0

@csijst-15102

Last seen 6.9 years ago

Singapore/National University of Singap…

Hi,

I understand the explanation for setting the design as ~ cell + dex, where we want to study the effect of dex treatment across cells; the model is then reduced to ~ cell. But I find it hard to understand how the program (with a fixed algorithm) would produce two different sets of p-values (or padj values) if we flip the design, i.e., ~ cell + dex versus ~ dex + cell, reducing to ~ cell in either scenario. My interest is in studying the treatment.

I am tempted to use the Wald test since I am only testing one condition (treatment), but my colleague (a more experienced bioinformatician) strongly advised me to use the LRT instead. I ran both tests and noticed slight differences in the padj values: the LRT shows borderline significance for the genes I am interested in, whereas the Wald test shows borderline non-significance. This is, of course, only what I observe in the current dataset.

Should I then trust my instincts and use the Wald test, or follow my more experienced colleague and use the LRT?

Thank you.

Regards,

Johann

deseq2 design and contrast matrix • 1.3k views

7.8 years ago • updated 7.7 years ago • csijst


Hi Dr Michael,

Thank you! I had the same feeling initially, because the algorithm is fixed. It's almost like saying 3 + 2 and 2 + 3: if I "reduce" 3, I'd still get 2.

But I was reading on some sites (Bioconductor online manual, Likelihood Ratio Test section; DGE analysis, time course analysis section) about how to conduct the LRT, and it seems that I would need to put the parameter of interest last. So I wanted to clarify.

Regards,

Johann

7.7 years ago • csijst

score 1 · Answer 1 · 2018-04-04

The order of variables in the full design doesn't matter for fitting the model. It only matters when you extract results without specifying a particular coefficient to look at. This is discussed in the vignette. If you just call results(dds), the software doesn't know which variable is of interest, so the default (here and in other methods) is to look at the last coefficient. In the case of the LRT, the p-value won't change at all between "~cell + dex" vs "~cell" and "~dex + cell" vs "~cell"; the only difference you will see when calling results(dds) is which LFC is printed. This is discussed in the LRT section of the help page for ?results. Which test you choose is up to you as the statistical analyst.
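To check this on your own data, a minimal sketch along the following lines may help. It assumes the airway example dataset (which has cell and dex columns, with dex levels "trt" and "untrt"); the dataset, the row filter, and the object names are illustrative assumptions and are not taken from the question. The same LRT is run with the variables in both orders, then the p-values and the description of the printed LFC are compared.

```r
library(DESeq2)
library(airway)   # assumption: example data with 'cell' and 'dex' columns

data("airway")
dds <- DESeqDataSet(airway, design = ~ cell + dex)
dds <- dds[rowSums(counts(dds)) >= 10, ]   # drop near-empty genes to save time

# Same data, with the variables listed in the opposite order
dds_flipped <- dds
design(dds_flipped) <- ~ dex + cell

# LRT of the full model against the reduced model without 'dex'
dds         <- DESeq(dds,         test = "LRT", reduced = ~ cell)
dds_flipped <- DESeq(dds_flipped, test = "LRT", reduced = ~ cell)

res         <- results(dds)
res_flipped <- results(dds_flipped)

# The p-values come from the full-vs-reduced comparison, so they are
# expected to agree up to numerical tolerance
all.equal(res$pvalue, res_flipped$pvalue)

# What differs is the LFC that is printed: by default, the last coefficient
mcols(res)$description[2]
mcols(res_flipped)$description[2]

# To get the dex LFC regardless of design order, name it explicitly.
# The level names "trt" and "untrt" are specific to the airway data.
resultsNames(dds_flipped)
res_dex <- results(dds_flipped, contrast = c("dex", "trt", "untrt"))
```

With test = "LRT", naming a contrast only changes which LFC column is reported; the p-values still come from the comparison of the full and reduced models, as described in ?results.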