Unexpected results in the dba.report output files
3
0
Entering edit mode
Silvia • 0
@silvia-7163
Last seen 3.4 years ago
United Kingdom

Hi,
I get unexpected results from the dba.report output, with some FDR values being 1. I also see some inconsistencies between that output and the underlying data passed to that function. For example, this is a part of the dba.report output, called using the command dba.report(db_analysis, contrast=1, th=1, file="sites"):

"1",201498174,201504025,9.39,9.14,9.6,-0.47,1.96e-09,1.61e-06
"3",156802970,156808000,8.34,7.97,8.64,-0.67,2.57e-09,1.94e-06
"19",57344104,57347869,7.07,6.5,7.48,-0.98,2.57e-09,1.94e-06
"1",24563571,24567595,7,6.17,7.52,-1.35,2.57e-09,1.94e-06
"7",123064873,123071974,9.82,9.53,10.07,-0.53,2.58e-09,1.94e-06
"16",71651236,71652333,4.22,5.17,0.18,4.99,2.96e-09,1
"8",82533007,82539064,7.6,7.09,7.98,-0.89,3.13e-09,2.26e-06
"9",27466888,27470911,7.22,6.5,7.7,-1.2,3.21e-09,2.26e-06
"4",176021229,176026923,6.4,6.93,5.54,1.39,3.24e-09,2.26e-06
"5",14410122,14418012,9.8,10.01,9.55,0.46,3.29e-09,2.26e-06
"15",42861570,42864785,7.43,6.88,7.83,-0.95,3.4e-09,2.26e-06

...in which the peak on chromosome 16 has an unexpected FDR of 1. However, if I extract the original values corresponding to those same lines from the same contrast in my DBA object (using the command db_analysis$contrasts[1][[1]][[5]]$de) I get the following results:

3748   3748 1.958108e-09 1.610824e-06
313     313 2.574405e-09 1.939354e-06
23318 23318 2.567537e-09 1.939354e-06
30796 30796 2.565474e-09 1.939354e-06
42381 42381 2.581988e-09 1.939354e-06
16691 16691 3.397763e-09 2.259378e-06
34115 34115 3.244742e-09 2.259378e-06
34361 34361 3.287747e-09 2.259378e-06
45122 45122 3.131840e-09 2.259378e-06
46641 46641 3.213740e-09 2.259378e-06
47711 47711 3.400413e-09 2.259378e-06

Unfortunately I cannot send you the input files (both for their size and because it's still unpublished data), so hope this could help you finding what's going on with this command. The R version I'm using is 3.4.0 (2017-04-21), whereas the version of DiffBind is 2.4.8 .

Thanks!

Silvia

diffbind • 1.6k views
ADD COMMENT
0
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 5 weeks ago
Cambridge, UK

Hi Silvia-

Turns out this is a bug in DiffBind relating to some changes made somewhere along the way in DESeq2.

Specifically, dba.report() returns the report sorted by p-value. However in certain cases, some sites that have low p-values are excluded from multiple testing correction by DESeq2 and are assigned a FDR value of 1.0.

I'm looking at the best way to address this and check in a bugfix soon for DiffBind 3.6

There are a couple of workarounds. One is to use a FDR threshold less than 1.0 (for example th=.99), which will exclude all the sites with FDR=1.0 (which you are probably not interested in anyway). The other way is to re-sort the report by FDR:

> dbreport <- dba.report(myDBA, th=1)
> dbreport <- dbreport[order(dbreport$FDR),]

In this case, there will still be a problem if you write the report out within dba.report(), so you would need to write out the returned, re-sorted report yourself as a .csv file.

I'll let you know when I've checked in a fix.

Cheers-

Rory

ADD COMMENT
0
Entering edit mode
Silvia • 0
@silvia-7163
Last seen 3.4 years ago
United Kingdom

Thanks a lot Rory for looking into this.

However, after having a second look at your reply I got a bit confused... You said that the error of writing FDR=1 to the output file happens for some sites that have low p-values (did you mean 'high', btw?). The p-value in the example above corresponding to FDR=1 is 2.96E-09, which is actually very significant and it would therefore not be a good idea to discard that site. Did you mean that in cases like this, the corresponding p-value is also erroneous/wrong?

Cheers,

Silvia

ADD COMMENT
0
Entering edit mode

DESeq2 is indeed assigning an FDR of 1.0 to a site with a p-value of 2.96E-09. As I understand it, this is due to their implementation of independent filtering, which is on by default. See the man page for DESeq2::results for more details, and the associated reference Independent filtering increases detection power for high-throughput experiments.

I'm planning on taking a closer look at independent filtering in DiffBind, as it should either a) change the alpha parameter based on the desired FDR threshold or b) possibly turn off independent filtering entirely.

Cheers-

Rory

ADD REPLY
0
Entering edit mode
ck15115 • 0
@ck15115-14737
Last seen 6.9 years ago

I am experiencing the same problem with DiffBind3.6. When do you think a bug-fixed version will be released?

ADD COMMENT
0
Entering edit mode

The simple fix (sorting by FDR instead of p-value) is there now, DiffBind 2.6.5.

-R

ADD REPLY
0
Entering edit mode

Thanks for that, although it still doesn't solve the problem of (quite some) very significant regions being discarded because of their (post-filtering modified) FDRs. It would be ideal to turn off this extra-filtering step done by DESeq2 within DiffBind. Is there an anticipated release date for the next version?

ADD REPLY

Login before adding your answer.

Traffic: 793 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6