DESeq2 Ignores Some Significantly Differentially Expressed Genes Due to an Extreme Count Outlier
ijayden.lung ▴ 10
@ijaydenlung-16368

For example, suppose the read counts of gene Z are 0, 0, 0, 123 in condition A and 13000, 13500, 12500, 14000 in condition B. Gene Z is clearly differentially expressed between condition A and condition B, yet DESeq2 sometimes sets its p-value to NA because of the outlier in condition A.

I suggest that, before filtering outliers, DESeq2 first test whether the maximum count in condition A is far below the minimum count in condition B. Such a check would reduce the false negative rate.
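The proposed check can be sketched in base R as follows. This is only an illustration of the idea in the question, not DESeq2's code: the function name groups_fully_separated and the min_fold parameter are hypothetical, and "far less than" is approximated by requiring the larger group's minimum to exceed min_fold times the smaller group's maximum.

```r
# Hypothetical helper (not part of DESeq2): returns TRUE when every count in
# one group lies below every count in the other group by a factor of min_fold.
groups_fully_separated <- function(counts, condition, min_fold = 1) {
  condition <- droplevels(factor(condition))
  stopifnot(nlevels(condition) == 2, length(counts) == length(condition))
  by_group <- split(counts, condition)
  # TRUE if the largest value in 'lo' (scaled) is still below the smallest in 'hi'
  is_separated <- function(lo, hi) max(lo) * min_fold < min(hi)
  is_separated(by_group[[1]], by_group[[2]]) ||
    is_separated(by_group[[2]], by_group[[1]])
}

# Counts for gene Z from the question: 4 samples in A, then 4 samples in B
gene_z <- c(0, 0, 0, 123, 13000, 13500, 12500, 14000)
condition <- rep(c("A", "B"), each = 4)

separated_strict <- groups_fully_separated(gene_z, condition)
separated_10x <- groups_fully_separated(gene_z, condition, min_fold = 10)
```

For gene Z both calls return TRUE, since the outlier count of 123 in condition A is still more than ten times smaller than the lowest count in condition B (12500). A gene flagged this way could be exempted from outlier filtering.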

deseq2 • 368 views
@mikelove

Thanks. I'll take a look.

The outlier filtering procedure is not ideal and I mean to address it with a systematic fix.

You can turn off the default outlier filtering by setting cooksCutoff=FALSE in results().
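A minimal sketch of that workaround, assuming DESeq2 is installed. The data set here is simulated with makeExampleDESeqDataSet(), and the counts from the question are written into the first gene purely for illustration. Whether the default call reports NA for that gene may depend on the DESeq2 version, so no output is shown.

```r
library(DESeq2)

set.seed(1)
# Simulated data: 8 samples, the first 4 in condition A and the last 4 in B
dds <- makeExampleDESeqDataSet(n = 1000, m = 8)

# Overwrite the first gene with the counts from the question (illustrative only)
cts <- counts(dds)
cts[1, ] <- c(0L, 0L, 0L, 123L, 13000L, 13500L, 12500L, 14000L)
counts(dds) <- cts

dds <- DESeq(dds, quiet = TRUE)

# Default: genes flagged by Cook's distance may get pvalue and padj set to NA
res_default <- results(dds)

# Workaround: disable the Cook's distance outlier filtering
res_no_filter <- results(dds, cooksCutoff = FALSE)

res_default[1, c("log2FoldChange", "pvalue", "padj")]
res_no_filter[1, c("log2FoldChange", "pvalue", "padj")]
```

Note that cooksCutoff=FALSE disables the filter for every gene, so genes whose significance is driven by a single extreme count will no longer be flagged and should be inspected manually.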


Thanks for the report. I added a fix for this in the development branch (v.1.21). It is a heuristic for the simple two-group design that skips filtering based on Cook's distance in a case like this.
