DiffBind :Retrieve count value from Diffbind by dab.report
1
0
Entering edit mode
JunLVI ▴ 40
@junlvi-8996
Last seen 6.4 years ago
Japan

 

I have questions about "dba.report",

I applied "DiffBind" to analyze my dataset like:

 

H3K27ac_CellA_CellD <- dba(sampleSheet="sampleconfiguration.csv",)
H3K27ac_CellA_CellD_Count <- dba.count(H3K27ac_CellA_CellD, score= DBA_SCORE_TMM_READS_EFFECTIVE, mapQCth=0, bRemoveDuplicates=FALSE)
H3K27ac_CellA_CellD_Diff <- dba.analyze(H3K27ac_CellA_CellD_Count,method=DBA_DESEQ2)
H3K27ac_CellA_CellD_Diff_All <- dba.report(H3K27ac_CellA_CellD_Diff,method=DBA_DESEQ2, th=1)
H3K27ac_All <- data.frame (Conc_WT=H3K27ac_CellA_CellD_Diff_All @elementMetadata$Conc_WT,
                           Conc_GeneXcKO= H3K27ac_CellA_CellD_Diff_All @elementMetadata$Conc_GeneXcKO)

H3K27ac_CellA_CellD_Diff_All  @elementMetadata
DataFrame with 6523 rows and 6 columns
          Conc Conc_GeneXcKO   Conc_WT      Fold   p-value       FDR
     <numeric>    <numeric> <numeric> <numeric> <numeric> <numeric>
1         4.77        -0.51      5.75     -6.26  6.51e-11  4.11e-07
2         4.41         0.27      5.37     -5.11  2.55e-08  8.06e-05
...
6523      3.88         4.06      3.67      0.39         1         1

I was wondering how these values are calculated:

Conc Conc_GeneXcKO Conc_WT

since DESEQ2 was utilized to perform the differential analysis, does this means the "regularized log transformation" "DESeq2"  was applied to the raw count value?  Or in my case above, simple log transformation of "TMM_READS_EFFECTIVE"?

Either case, it seems that values were "normalized" one way or another.

when I plotted the "H3K27ac_All" containing the "Conc_GeneXcKO Conc_WT" in my case,

p <- ggplot(H3K27ac_All, aes(Conc_WT, Conc_GeneXcKO))
p + geom_point() +
    geom_abline(intercept = 0, slope = 1, linetype=2) +
    scale_x_continuous(name="Epifeatures\n in WT Cell ", limits = c(-1, 11)) +
    scale_y_continuous(name="Epifeatures \n in KO Cell ",limits = c(-1, 11)) ​

I expected the dots distributed along the line with slope of 1, somehow the results turned out to be different.

Could this be a result of improper data process? 

I just noticed some related post DiffBind: Normalization for DESeq2 and DiffBind dba.report() output: Is "concentration" normalized?:

so it seems to me that in my case, the "conc" is log transformed counts  normalized by library size. 

or set the lib size as "sizeFactor" in "DESeq2"

"DESeq2::sizeFactors(DESeqDataSeq) <- libsize/min(libsize)" 

just make sure, is log transformation here equivalent to "rlog" in "DEseq2"?

 

 

 

diffbind • 1.2k views
ADD COMMENT
0
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 10 weeks ago
Cambridge, UK

The transform is a little different than rlog. The read counts are, as you saw above, normalized by the minimum library size. The concentrations are the log2() of the mean of these values.

You can do a "pure" DESeq2 normalization by setting bFulLibrarySize=FALSE in the call to dba.analyze(). This will only use the reads within peaks to calculate the normalisation factors. 

I can think of a couple of things you may want to look at to see what is going on. One is to retrieve the read counts in the report by setting bCounts=TRUE in the call to dba.report(). (You can get these all in a data frame by setting DataType=DBA_DATA_FRAME in the call as well). You can retrieve normalized or "raw" read counts by changing the value of the bNormalized parameter -- it can be interesting to compare "raw" and normalized counts. Finally, getting an MA plot would be informative to see if the shift from the diagonal seems to be "real" or not. You can call dba.plotMA() with bNormalized=FALSE to see the raw distribution, and you can also compare the impact of setting bFullLibrarySize=FALSE to get a more complete picture of how your reads are distributed and the effects of normalization.

-Rory

ADD COMMENT

Login before adding your answer.

Traffic: 596 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6