Question

MA plot DiffBind in ggplot2

reubenmcgregor88 • 0

@reubenmcgregor88-13722

Hi again, and sorry for all the questions at the moment.

I would like to replicate the following DiffBind function call in ggplot2:

> dba.plotMA(tamoxifen, bXY=TRUE)

However, I don't know how to retrieve the underlying data. In DESeq2 I believe it would be the baseMean column. Or is the "Conc" column output by dba.report() the set of values used for the MA plot?

I know one option would be:

> DBA$contrasts[[n]]$DESeq2$DEdata, bReduceObjects=F

Here I assume DBA is the DBA object, but what is "n"?

Also, on a separate note, is the "Fold" column output by dba.report() by default the log2 fold change?

Thanks

diffbind ggplot2 r chipseq • 2.6k views

ADD COMMENT • link updated 7.5 years ago by Rory Stark ★ 5.2k • written 7.6 years ago by reubenmcgregor88 • 0

Rory Stark ★ 5.2k

@rory-stark-5741

Cambridge, UK

This is an artifact of reporting log values of normalized data. After normalization, some of the values may be less than one and hence will have negative log2 scores. So you are correct: if the normalized read counts for a peak in the replicates of one condition are very small (negative log values), and the counts for the replicates of the other condition are consistently higher, the peak can be identified as being differentially bound (with low FDR).
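As a small illustration of the arithmetic, here is a sketch in base R. The counts are made-up numbers, and taking log2 of the replicate mean is an assumption made for the example, not a statement of how DiffBind computes its Conc columns internally.

```r
# Hypothetical normalized read counts for one peak, three replicates per condition
cond_a <- c(0.25, 0.5, 0.75)  # consistently below 1
cond_b <- c(3, 4, 5)          # consistently higher

# Assumption for illustration: concentration = log2 of the mean normalized count
conc_a <- log2(mean(cond_a))  # mean is 0.5, so the log2 value is negative
conc_b <- log2(mean(cond_b))  # mean is 4

# On the log2 scale the fold change is a simple difference
fold <- conc_a - conc_b

print(c(conc_a = conc_a, conc_b = conc_b, fold = fold))
```

A negative concentration here only means the normalized value is below 1; the difference between the two conditions can still be large and consistent.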

-Rory

ADD COMMENT • link 7.5 years ago Rory Stark ★ 5.2k

score 2 · Accepted Answer · 2017-09-11

Rory Stark ★ 5.2k

@rory-stark-5741

Cambridge, UK

Yes, the MA plot data is the same as reported by dba.report(). The X axis is the "Conc" column, and the Y axis is the "Fold" column. These are all reported in log2 form, so that Fold is simply Conc1-Conc2.
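A minimal sketch of how this could be redrawn in ggplot2, assuming the example `tamoxifen` object that `data(tamoxifen_analysis)` loads from the DiffBind package. Here `contrast = 1` selects the first contrast and `th = 1` keeps every site rather than only the significant ones. The names `report_df`, `ma_plot`, and `xy_plot`, the 0.05 cutoff, and the colours are illustrative choices and are not meant to reproduce dba.plotMA() exactly.

```r
library(DiffBind)
library(ggplot2)

# Example DBA object with a completed analysis, shipped with DiffBind
data(tamoxifen_analysis)

# Full report for the first contrast; th = 1 retains all sites
report_df <- as.data.frame(dba.report(tamoxifen, contrast = 1, th = 1))

# Flag sites below an illustrative FDR cutoff
report_df$significant <- report_df$FDR <= 0.05

# MA plot: Conc on the X axis, Fold on the Y axis (both log2)
ma_plot <- ggplot(report_df, aes(x = Conc, y = Fold, colour = significant)) +
  geom_point(size = 0.8, alpha = 0.6) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_colour_manual(values = c("FALSE" = "grey60", "TRUE" = "magenta3")) +
  labs(x = "log2 concentration (Conc)", y = "log2 fold change (Fold)")

# XY plot (the bXY=TRUE view): the two per-group Conc_ columns against each other
conc_cols <- grep("^Conc_", names(report_df), value = TRUE)
stopifnot(length(conc_cols) == 2)

xy_plot <- ggplot(report_df,
                  aes(x = .data[[conc_cols[1]]], y = .data[[conc_cols[2]]],
                      colour = significant)) +
  geom_point(size = 0.8, alpha = 0.6) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  scale_colour_manual(values = c("FALSE" = "grey60", "TRUE" = "magenta3")) +
  labs(x = conc_cols[1], y = conc_cols[2])

print(ma_plot)
print(xy_plot)
```

For your own data, replace `tamoxifen` with your analyzed DBA object and pick the contrast number you are interested in.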

I'm not sure what you mean by the bReduceObjects=F line of code.

-Rory

ADD COMMENT • link 7.6 years ago Rory Stark ★ 5.2k

Thanks Rory,

A related question: in the dba.report() output, what are the columns named after the conditions (i.e. Conc_condition1, Conc_condition2)? I assume these are what dba.plotMA uses when bXY=TRUE and that they are (as you explained above) log2-transformed values, but where are these values derived from? I ask because I have some negative values for peaks that were identified as differentially expressed.

Am I interpreting this correctly if I say that these are very low-expressing peaks (below 1 in the non-log2 values) that were consistently (in all replicates) increased or decreased, and hence identified as significantly differentially expressed?

ADD REPLY • link 7.6 years ago reubenmcgregor88 • 0