Question: DiffBind :Retrieve count value from Diffbind by
gravatar for JunLVI
14 months ago by
JunLVI40 wrote:


I have questions about "",

I applied "DiffBind" to analyze my dataset like:


H3K27ac_CellA_CellD <- dba(sampleSheet="sampleconfiguration.csv",)
H3K27ac_CellA_CellD_Count <- dba.count(H3K27ac_CellA_CellD, score= DBA_SCORE_TMM_READS_EFFECTIVE, mapQCth=0, bRemoveDuplicates=FALSE)
H3K27ac_CellA_CellD_Diff <- dba.analyze(H3K27ac_CellA_CellD_Count,method=DBA_DESEQ2)
H3K27ac_CellA_CellD_Diff_All <-,method=DBA_DESEQ2, th=1)
H3K27ac_All <- data.frame (Conc_WT=H3K27ac_CellA_CellD_Diff_All @elementMetadata$Conc_WT,
                           Conc_GeneXcKO= H3K27ac_CellA_CellD_Diff_All @elementMetadata$Conc_GeneXcKO)

H3K27ac_CellA_CellD_Diff_All  @elementMetadata
DataFrame with 6523 rows and 6 columns
          Conc Conc_GeneXcKO   Conc_WT      Fold   p-value       FDR
     <numeric>    <numeric> <numeric> <numeric> <numeric> <numeric>
1         4.77        -0.51      5.75     -6.26  6.51e-11  4.11e-07
2         4.41         0.27      5.37     -5.11  2.55e-08  8.06e-05
6523      3.88         4.06      3.67      0.39         1         1

I was wondering how these values are calculated:

Conc Conc_GeneXcKO Conc_WT

since DESEQ2 was utilized to perform the differential analysis, does this means the "regularized log transformation" "DESeq2"  was applied to the raw count value?  Or in my case above, simple log transformation of "TMM_READS_EFFECTIVE"?

Either case, it seems that values were "normalized" one way or another.

when I plotted the "H3K27ac_All" containing the "Conc_GeneXcKO Conc_WT" in my case,

p <- ggplot(H3K27ac_All, aes(Conc_WT, Conc_GeneXcKO))
p + geom_point() +
    geom_abline(intercept = 0, slope = 1, linetype=2) +
    scale_x_continuous(name="Epifeatures\n in WT Cell ", limits = c(-1, 11)) +
    scale_y_continuous(name="Epifeatures \n in KO Cell ",limits = c(-1, 11)) ​

I expected the dots distributed along the line with slope of 1, somehow the results turned out to be different.

Could this be a result of improper data process? 

I just noticed some related post DiffBind: Normalization for DESeq2 and DiffBind output: Is "concentration" normalized?:

so it seems to me that in my case, the "conc" is log transformed counts  normalized by library size. 

or set the lib size as "sizeFactor" in "DESeq2"

"DESeq2::sizeFactors(DESeqDataSeq) <- libsize/min(libsize)" 

just make sure, is log transformation here equivalent to "rlog" in "DEseq2"?




ADD COMMENTlink modified 14 months ago by Rory Stark2.5k • written 14 months ago by JunLVI40
gravatar for Rory Stark
14 months ago by
Rory Stark2.5k
CRUK, Cambridge, UK
Rory Stark2.5k wrote:

The transform is a little different than rlog. The read counts are, as you saw above, normalized by the minimum library size. The concentrations are the log2() of the mean of these values.

You can do a "pure" DESeq2 normalization by setting bFulLibrarySize=FALSE in the call to dba.analyze(). This will only use the reads within peaks to calculate the normalisation factors. 

I can think of a couple of things you may want to look at to see what is going on. One is to retrieve the read counts in the report by setting bCounts=TRUE in the call to (You can get these all in a data frame by setting DataType=DBA_DATA_FRAME in the call as well). You can retrieve normalized or "raw" read counts by changing the value of the bNormalized parameter -- it can be interesting to compare "raw" and normalized counts. Finally, getting an MA plot would be informative to see if the shift from the diagonal seems to be "real" or not. You can call dba.plotMA() with bNormalized=FALSE to see the raw distribution, and you can also compare the impact of setting bFullLibrarySize=FALSE to get a more complete picture of how your reads are distributed and the effects of normalization.


ADD COMMENTlink written 14 months ago by Rory Stark2.5k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 229 users visited in the last hour