
Question: Building the correct model for DESeq2
25 days ago, allisondaly13 wrote:


I am trying to find differentially expressed genes in my RNA-Seq data. I have noticed that I get different p-values depending on the count matrix I provide: one that contains only the samples in the direct comparison, or one that contains more of the data.

For example, I am comparing 0.5h stimulated vs. 0.5h unstimulated, and 1.0h stimulated vs. 1.0h unstimulated. I can either:

1) Provide a count matrix with counts for all four conditions, then look at the p-values for 0.5h stimulated vs. 0.5h unstimulated; or
2) Provide a count matrix with counts only for the two conditions I am directly comparing (0.5h stimulated vs. 0.5h unstimulated).

When I compare the p-values from the two methods, they do not match.

I am wondering which approach is better, i.e. more accurate or truer to the data?

Thanks! Allison

Answer: Building the correct model for DESeq2
24 days ago, James W. MacDonald wrote:

In general, if you have more data in your model, the variance (dispersion) estimates are better, which tends to give more accurate results. As with all things, there are tradeoffs here: you are borrowing information from samples that are not part of the comparison you are making. If there is a good reason to think that the different groups have very different variabilities, then you might not want to combine them, but that's up to you as the analyst.
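
To make the tradeoff concrete, here is a minimal Python sketch. It is not DESeq2 and not its negative binomial model, just a simplified analogue using an ordinary pooled-variance t-statistic on made-up log-scale values for a single gene. All numbers and function names are hypothetical. The same two groups are compared twice: once with the variance estimated from those two groups only, and once with the variance pooled over all four groups.

```python
from statistics import mean

def pooled_variance(groups):
    """Pooled within-group variance and its residual degrees of freedom."""
    sum_sq = 0.0
    df = 0
    for values in groups:
        m = mean(values)
        sum_sq += sum((v - m) ** 2 for v in values)
        df += len(values) - 1
    return sum_sq / df, df

def t_statistic(a, b, groups_for_variance):
    """t-statistic for mean(a) - mean(b), with the variance pooled
    over whichever groups are passed in groups_for_variance."""
    var, df = pooled_variance(groups_for_variance)
    se = (var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean(a) - mean(b)) / se, df

# Hypothetical log-scale values for one gene, three replicates per group.
stim_05 = [8.1, 8.4, 8.0]
unstim_05 = [7.2, 7.6, 7.1]
stim_10 = [9.0, 8.2, 9.7]    # deliberately noisier than the 0.5h groups
unstim_10 = [7.0, 7.9, 6.4]  # deliberately noisier than the 0.5h groups

# Method 2 from the question: only the two groups being compared.
t_subset, df_subset = t_statistic(stim_05, unstim_05, [stim_05, unstim_05])

# Method 1 from the question: all four groups contribute to the variance.
t_full, df_full = t_statistic(
    stim_05, unstim_05, [stim_05, unstim_05, stim_10, unstim_10]
)

print(f"subset: t = {t_subset:.2f} on {df_subset} df")
print(f"full:   t = {t_full:.2f} on {df_full} df")
```

The difference in means is identical in both runs; only the variance estimate and its degrees of freedom change, which is why the p-values differ. Here the 1.0h groups were made noisier on purpose, so pooling inflates the variance used for the 0.5h comparison. If all groups had similar variability, the extra degrees of freedom would instead make the estimate more stable.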
