DESeq2 lfcShrink function used in DiffBind package
@aaed3153

Hello,

We have basically three questions that revolve around the DESeq2 lfcShrink function which is used by DiffBind.

  1. We have Cut&Tag samples and want to conduct differential binding analysis. Our main objective is to compare two samples and see if there are any major peaks that are different. Do you think DiffBind is suitable to conduct such an analysis?
  2. DESeq2's lfcShrink function uses "apeglm" as its default algorithm, while DiffBind's call to lfcShrink uses ashr. Is this right, and if so, what is the reason? It seems that apeglm is more suited to RNA-seq data while ashr is for other sequencing data. Could you offer more insight into why DiffBind uses ashr?
  3. Some of the pairwise DiffBind results were surprising. With DiffBind, the log2FoldChange for one of the peaks is 0.00115746. When we instead ran DESeq2 directly with the apeglm model in lfcShrink, the log2FoldChange for the same peak was 1.82097. Why would changing the model produce such a drastic difference in the log2FoldChange?

Thank you so much for reading and any help and insight is highly appreciated. Please let me know if you need any more information.

FoldChange DiffBind lfcShrink Cut&TagData

ATpoint ★ 4.6k
@atpoint-13662

The DiffBind author will give his own answer, but I can comment on some general aspects:

1) Yes, CUT&(Run/Tag) data in the end yield an integer count matrix that you can analyse with common tools such as DESeq2, edgeR and limma, and DiffBind uses the first two of these.

2) The ashr method has the advantage that it is more generic, as it can use contrasts, while apeglm needs coefficients. For apeglm one might therefore need to relevel the factors and rerun the Wald test to cover all possible pairwise comparisons. While that is not difficult to implement (https://www.biostars.org/p/448959/), ashr is more straightforward. The DESeq2 author routinely recommends both here on the support site, so the choice is yours.
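
To make the contrast-versus-coefficient point concrete, here is a minimal toy sketch in Python. It is not DESeq2's negative binomial GLM and does no shrinkage; it only uses group means of made-up log2 values to show that a model fitted with one reference level exposes coefficients against that reference only, so a comparison between two non-reference levels needs either a contrast or a refit with a different reference. The data and the helper names `fit_coefficients` and `contrast` are hypothetical. In DESeq2 itself this corresponds to the `coef` and `contrast` arguments of `lfcShrink`.

```python
from statistics import mean

# Made-up log2 signal for a single peak in three conditions (illustration only).
log2_signal = {
    "A": [5.0, 5.5, 4.5],
    "B": [7.0, 7.25, 6.75],
    "C": [6.0, 6.5, 5.5],
}


def fit_coefficients(data, reference):
    """Treatment coding for a one-factor design: one coefficient per
    non-reference level, equal to mean(level) - mean(reference)."""
    ref_mean = mean(data[reference])
    return {
        f"{level}_vs_{reference}": mean(values) - ref_mean
        for level, values in data.items()
        if level != reference
    }


def contrast(data, numerator, denominator):
    """Pairwise comparison taken directly from the group means,
    independent of which level is the reference."""
    return mean(data[numerator]) - mean(data[denominator])


# With A as reference, only comparisons against A exist as coefficients.
coefs_ref_a = fit_coefficients(log2_signal, reference="A")

# Route 1 (coefficient-style): relevel so that C is the reference, then refit.
coefs_ref_c = fit_coefficients(log2_signal, reference="C")

# Route 2 (contrast-style): combine coefficients of the original fit, no refit.
b_vs_c_from_coefs = coefs_ref_a["B_vs_A"] - coefs_ref_a["C_vs_A"]

print("Coefficients with reference A:", coefs_ref_a)
print("Coefficients with reference C:", coefs_ref_c)
print("B vs C as a contrast:", contrast(log2_signal, "B", "C"))
```

In this unshrunken toy both routes give the same B-versus-C value. With shrinkage applied the two routes need not agree, which is one reason the choice of estimator and reference level can matter.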

3) These shrinkage methods have their edge cases. I asked something similar before (in the thread "apeglm -- influence of reference level on logFC and svalues"), where even the order of factor levels can make a difference for some genes/peaks. While this is indeed not really satisfying (especially if one of these edge-case peaks is critically linked to a gene important for your research), I would stick to one analysis strategy to avoid confirmation bias and entering the so-called garden of forking paths.

I endorse all of ATpoint's comments for this question!

Thank you Rory Stark!

Thank you so much for the detailed answers to each question, ATpoint!
