Question

How to judge DESeq2_block and EdgeR_block result

0

Entering edit mode

ew152 • 0

@ew152-14361

Last seen 7.4 years ago

Hi,

I am using the Diffbind package to do the differential binding analysis for my ATAC-seq library. I followed the instruction to use blocking methods for my library, because I assumed there were substantial batch effects. Since I didn't know which one of DESeq2_block and Edger_block would generate the best result for me, I tried my library with both of the normalization methods. I characterized log(fold change) > 1 and p-value <0.05 as "positively enriched peaks", and log(fold change) < -1 and p-value<0.05 as "negatively enriched peaks". However, DESeq2_block generated 1530 positively enriched peaks and 556 negatively enriched peaks, while EdgeR_block generated 606 positively enriched peaks and 1811 negatively enriched peaks. The 606 positively enriched peaks from EdgeR_block is almost a subset of 1530 positively enriched peaks generated by DESeq2, and no negatively enriched peaks were characterized as positively enriched peaks in the other methods. Therefore I have the following questions:

1) What is the best way to make a judgement call between these two blocking methods?

2) I was advised to look at the normalized data through the "ReportingTools" library, but this library requires a DESeqDataSet or DESeqResults as input. How do I extract the DESeqDataSet from DBA object properly? I tried to follow the manual by typing: atac.analyze$contrasts[[1]]$DESeq2$block$DEdata, but it does not seem to have $DEdata.

3) If I also want to extract the EdgeR object from DBA what should I do?

Thank you so much!

Ergang

My code:

atac.list <- dba(sampleSheet=filename)

atac.count <- dba.count(atac.list)

atac.contrast <- dba.contrast(atac.count, categories=DBA_CONDITION, block=DBA_REPLICATE, minMembers = 2)

atac.analyze <- dba.analyze(atac.contrast, method=DBA_ALL_METHODS)

###try to see the normalized signal values for a peak

dds <-atac.analyze$contrasts[[1]]$DESeq2$block$DEdata

des2Report <- HTMLReport(shortName = "Differential_analysis_screening_library",
title = 'differential analysis of screening library using DESeq2', reportDirectory ="./DESeq2/reports")
publish(dds,des2Report, pvalueCutoff=0.05,annotation.db=NULL,
factor = colData(dds)$facs, reportDir="./DESeq2/reports")
finish(des2Report)

diffbind Deseq2 edger differential binding analysis • 1.7k views

ADD COMMENT • link updated 8.2 years ago by Rory Stark ★ 5.3k • written 8.2 years ago by ew152 • 0

score 1 · Answer 1 · 2017-12-15

1

Entering edit mode

Rory Stark ★ 5.3k

@rory-stark-5741

Last seen 13 months ago

Cambridge, UK

The difference is most likely due to differences in normalization. the TMM normalization used by edgeR assumes that most of the regions do not change enrichment levels. If this is not the case, the edgeR result will be the incorrect one. You can check by plotting three MA plots:

> par(mfrow=c(2,2)) 
> dba.plotMA(atac.analyze, th=0,  bNormalized=FALSE)
> dba.plotMA(atac.analyze, method=DBA_DESEQ2_BLOCK)
> dba.plotMA(atac.analyze, method=DBA_EDGER_BLOCK)

If the densest part of the non-normalized plot is distinctly above or below the zero line, then you should use the DESeq2 version (with the default bFullLibrarySize=TRUE).

2. To retrieve the edgeR DGEList object and/or the DESeq2 DESeqDataSeq object, you should set bReduceObjects=FALSE in the call to dba.analyze():

DESeq2 object: atac.analyze$contrasts[[1]]$DESeq2$block$DEdata

edgeR object: atac.analyze$contrasts[[1]]$edgeR$block

Cheers-

Rory

ADD COMMENT • link 8.2 years ago Rory Stark ★ 5.3k

1

Entering edit mode

To expand on Rory's answer: I should note that it is easy to achieve library size normalization in edgeR, simply by not performing TMM normalization, i.e., not running calcNormFactors. There's nothing mandatory about running calcNormFactors in order to use edgeR, it just happens to be what we do to normalize RNA-seq data. In fact, you can (and I often do) put other normalization factors in there, optimized for a given type of data, e.g., Hi-C or ChIP-seq.

ADD REPLY • link 8.2 years ago Aaron Lun ★ 29k

0

Entering edit mode

Hey Rory,

Thank you so much for your help!

I made three MA plots according to your suggestion. I think it is a little bit tricky to tell if the non-normalized plot is above/below the zero line. Do you mind taking a look of the three graphs? I can email those to you.

Best,

Ergang

ADD REPLY • link 8.2 years ago ew152 • 0

0

Entering edit mode

Yes, I can look, send them to me (or post them here).

ADD REPLY • link 8.2 years ago Rory Stark ★ 5.3k