Hi,
I am using the Diffbind package to do the differential binding analysis for my ATAC-seq library. I followed the instruction to use blocking methods for my library, because I assumed there were substantial batch effects. Since I didn't know which one of DESeq2_block and Edger_block would generate the best result for me, I tried my library with both of the normalization methods. I characterized log(fold change) > 1 and p-value <0.05 as "positively enriched peaks", and log(fold change) < -1 and p-value<0.05 as "negatively enriched peaks". However, DESeq2_block generated 1530 positively enriched peaks and 556 negatively enriched peaks, while EdgeR_block generated 606 positively enriched peaks and 1811 negatively enriched peaks. The 606 positively enriched peaks from EdgeR_block is almost a subset of 1530 positively enriched peaks generated by DESeq2, and no negatively enriched peaks were characterized as positively enriched peaks in the other methods. Therefore I have the following questions:
1) What is the best way to make a judgement call between these two blocking methods?
2) I was advised to look at the normalized data through the "ReportingTools" library, but this library requires a DESeqDataSet or DESeqResults as input. How do I extract the DESeqDataSet from DBA object properly? I tried to follow the manual by typing: atac.analyze$contrasts[[1]]$DESeq2$block$DEdata, but it does not seem to have $DEdata.
3) If I also want to extract the EdgeR object from DBA what should I do?
Thank you so much!
Ergang
My code:
atac.list <- dba(sampleSheet=filename)
atac.count <- dba.count(atac.list)
atac.contrast <- dba.contrast(atac.count, categories=DBA_CONDITION, block=DBA_REPLICATE, minMembers = 2)
atac.analyze <- dba.analyze(atac.contrast, method=DBA_ALL_METHODS)
###try to see the normalized signal values for a peak
dds <-atac.analyze$contrasts[[1]]$DESeq2$block$DEdata
des2Report <- HTMLReport(shortName = "Differential_analysis_screening_library",
title = 'differential analysis of screening library using DESeq2', reportDirectory ="./DESeq2/reports")
publish(dds,des2Report, pvalueCutoff=0.05,annotation.db=NULL,
factor = colData(dds)$facs, reportDir="./DESeq2/reports")
finish(des2Report)

To expand on Rory's answer: I should note that it is easy to achieve library size normalization in edgeR, simply by not performing TMM normalization, i.e., not running
calcNormFactors. There's nothing mandatory about runningcalcNormFactorsin order to use edgeR, it just happens to be what we do to normalize RNA-seq data. In fact, you can (and I often do) put other normalization factors in there, optimized for a given type of data, e.g., Hi-C or ChIP-seq.Hey Rory,
Thank you so much for your help!
I made three MA plots according to your suggestion. I think it is a little bit tricky to tell if the non-normalized plot is above/below the zero line. Do you mind taking a look of the three graphs? I can email those to you.
Best,
Ergang
Yes, I can look, send them to me (or post them here).