Search
Question: Unexpected results in the dba.report output files
0
gravatar for Silvia
27 days ago by
Silvia0
Silvia0 wrote:

Hi,
I get unexpected results from the dba.report output, with some FDR values being 1. I also see some inconsistencies between that output and the underlying data passed to that function. For example, this is a part of the dba.report output, called using the command dba.report(db_analysis, contrast=1, th=1, file="sites"):

"1",201498174,201504025,9.39,9.14,9.6,-0.47,1.96e-09,1.61e-06
"3",156802970,156808000,8.34,7.97,8.64,-0.67,2.57e-09,1.94e-06
"19",57344104,57347869,7.07,6.5,7.48,-0.98,2.57e-09,1.94e-06
"1",24563571,24567595,7,6.17,7.52,-1.35,2.57e-09,1.94e-06
"7",123064873,123071974,9.82,9.53,10.07,-0.53,2.58e-09,1.94e-06
"16",71651236,71652333,4.22,5.17,0.18,4.99,2.96e-09,1
"8",82533007,82539064,7.6,7.09,7.98,-0.89,3.13e-09,2.26e-06
"9",27466888,27470911,7.22,6.5,7.7,-1.2,3.21e-09,2.26e-06
"4",176021229,176026923,6.4,6.93,5.54,1.39,3.24e-09,2.26e-06
"5",14410122,14418012,9.8,10.01,9.55,0.46,3.29e-09,2.26e-06
"15",42861570,42864785,7.43,6.88,7.83,-0.95,3.4e-09,2.26e-06

...in which the peak on chromosome 16 has an unexpected FDR of 1. However, if I extract the original values corresponding to those same lines from the same contrast in my DBA object (using the command db_analysis$contrasts[1][[1]][[5]]$de) I get the following results:

3748   3748 1.958108e-09 1.610824e-06
313     313 2.574405e-09 1.939354e-06
23318 23318 2.567537e-09 1.939354e-06
30796 30796 2.565474e-09 1.939354e-06
42381 42381 2.581988e-09 1.939354e-06
16691 16691 3.397763e-09 2.259378e-06
34115 34115 3.244742e-09 2.259378e-06
34361 34361 3.287747e-09 2.259378e-06
45122 45122 3.131840e-09 2.259378e-06
46641 46641 3.213740e-09 2.259378e-06
47711 47711 3.400413e-09 2.259378e-06

Unfortunately I cannot send you the input files (both for their size and because it's still unpublished data), so hope this could help you finding what's going on with this command. The R version I'm using is 3.4.0 (2017-04-21), whereas the version of DiffBind is 2.4.8 .

Thanks!

Silvia

ADD COMMENTlink modified 17 days ago • written 27 days ago by Silvia0
0
gravatar for Rory Stark
18 days ago by
Rory Stark2.1k
CRUK, Cambridge, UK
Rory Stark2.1k wrote:

Hi Silvia-

Turns out this is a bug in DiffBind relating to some changes made somewhere along the way in DESeq2.

Specifically, dba.report() returns the report sorted by p-value. However in certain cases, some sites that have low p-values are excluded from multiple testing correction by DESeq2 and are assigned a FDR value of 1.0.

I'm looking at the best way to address this and check in a bugfix soon for DiffBind 3.6

There are a couple of workarounds. One is to use a FDR threshold less than 1.0 (for example th=.99), which will exclude all the sites with FDR=1.0 (which you are probably not interested in anyway). The other way is to re-sort the report by FDR:

> dbreport <- dba.report(myDBA, th=1)
> dbreport <- dbreport[order(dbreport$FDR),]

In this case, there will still be a problem if you write the report out within dba.report(), so you would need to write out the returned, re-sorted report yourself as a .csv file.

I'll let you know when I've checked in a fix.

Cheers-

Rory

ADD COMMENTlink written 18 days ago by Rory Stark2.1k
0
gravatar for Silvia
17 days ago by
Silvia0
Silvia0 wrote:

Thanks a lot Rory for looking into this.

However, after having a second look at your reply I got a bit confused... You said that the error of writing FDR=1 to the output file happens for some sites that have low p-values (did you mean 'high', btw?). The p-value in the example above corresponding to FDR=1 is 2.96E-09, which is actually very significant and it would therefore not be a good idea to discard that site. Did you mean that in cases like this, the corresponding p-value is also erroneous/wrong?

Cheers,

Silvia

ADD COMMENTlink written 17 days ago by Silvia0

DESeq2 is indeed assigning an FDR of 1.0 to a site with a p-value of 2.96E-09. As I understand it, this is due to their implementation of independent filtering, which is on by default. See the man page for DESeq2::results for more details, and the associated reference Independent filtering increases detection power for high-throughput experiments.

I'm planning on taking a closer look at independent filtering in DiffBind, as it should either a) change the alpha parameter based on the desired FDR threshold or b) possibly turn off independent filtering entirely.

Cheers-

Rory

ADD REPLYlink modified 16 days ago • written 16 days ago by Rory Stark2.1k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 204 users visited in the last hour