Dear friends,
While using the latest release (3.13), I ran into a few problems and would appreciate your help.
I repeated the whole workflow from the vignette using the built-in data sets, but my final result (246 significantly differentially bound sites; see the attached file) differs from the number given in the vignette (249).
Those 246 sites were produced by the DESeq2 method (the default). However, if I set method = DBA_EDGER in the analysis step, I get 0 significantly differentially bound sites, which is quite confusing. I do not know what is going wrong here (see the attached file).
As for the normalization step: it is newly added, as older versions had no normalization function. Is this step necessary? DESeq2 and edgeR each have their own way of normalizing data, so should we normalize the data first (using dba.normalize()) and then normalize it again within DESeq2 or edgeR (using dba.analyze())? Or does dba.analyze() adopt the normalization (i.e. size factors) from dba.normalize() and then proceed with the remaining steps?
Many thanks for your kind help!
The code and its output are shown below:
> tamoxifen
11 Samples, 2845 sites in matrix:
ID Tissue Factor Condition Treatment Replicate Reads FRiP
1 BT4741 BT474 ER Resistant Full-Media 1 652697 0.16
2 BT4742 BT474 ER Resistant Full-Media 2 663370 0.15
3 MCF71 MCF7 ER Responsive Full-Media 1 346429 0.31
4 MCF72 MCF7 ER Responsive Full-Media 2 368052 0.19
5 MCF73 MCF7 ER Responsive Full-Media 3 466273 0.25
6 T47D1 T47D ER Responsive Full-Media 1 399879 0.11
7 T47D2 T47D ER Responsive Full-Media 2 1475415 0.06
8 MCF7r1 MCF7 ER Resistant Full-Media 1 616630 0.22
9 MCF7r2 MCF7 ER Resistant Full-Media 2 593224 0.14
10 ZR751 ZR75 ER Responsive Full-Media 1 706836 0.33
11 ZR752 ZR75 ER Responsive Full-Media 2 2575408 0.22
> plot(tamoxifen)
> #3.Normalizing the data
> tamoxifen <- dba.normalize(tamoxifen)
> tamoxifen
11 Samples, 2845 sites in matrix:
ID Tissue Factor Condition Treatment Replicate Reads FRiP
1 BT4741 BT474 ER Resistant Full-Media 1 652697 0.16
2 BT4742 BT474 ER Resistant Full-Media 2 663370 0.15
3 MCF71 MCF7 ER Responsive Full-Media 1 346429 0.31
4 MCF72 MCF7 ER Responsive Full-Media 2 368052 0.19
5 MCF73 MCF7 ER Responsive Full-Media 3 466273 0.25
6 T47D1 T47D ER Responsive Full-Media 1 399879 0.11
7 T47D2 T47D ER Responsive Full-Media 2 1475415 0.06
8 MCF7r1 MCF7 ER Resistant Full-Media 1 616630 0.22
9 MCF7r2 MCF7 ER Resistant Full-Media 2 593224 0.14
10 ZR751 ZR75 ER Responsive Full-Media 1 706836 0.33
11 ZR752 ZR75 ER Responsive Full-Media 2 2575408 0.22
> tamoxifen <- dba.contrast(tamoxifen)
Computing results names...
> tamoxifen
11 Samples, 2845 sites in matrix:
ID Tissue Factor Condition Treatment Replicate Reads FRiP
1 BT4741 BT474 ER Resistant Full-Media 1 652697 0.16
2 BT4742 BT474 ER Resistant Full-Media 2 663370 0.15
3 MCF71 MCF7 ER Responsive Full-Media 1 346429 0.31
4 MCF72 MCF7 ER Responsive Full-Media 2 368052 0.19
5 MCF73 MCF7 ER Responsive Full-Media 3 466273 0.25
6 T47D1 T47D ER Responsive Full-Media 1 399879 0.11
7 T47D2 T47D ER Responsive Full-Media 2 1475415 0.06
8 MCF7r1 MCF7 ER Resistant Full-Media 1 616630 0.22
9 MCF7r2 MCF7 ER Resistant Full-Media 2 593224 0.14
10 ZR751 ZR75 ER Responsive Full-Media 1 706836 0.33
11 ZR752 ZR75 ER Responsive Full-Media 2 2575408 0.22
Design: [~Condition] | 1 Contrast:
Factor Group Samples Group2 Samples2
1 Condition Responsive 7 Resistant 4
> tamoxifen <- dba.analyze(tamoxifen)
Applying Blacklist/Greylists...
Genome detected: Hsapiens.UCSC.hg19
Applying blacklist...
Removed: 1 of 2845 intervals.
Counting control reads for greylist...
Building greylist: C:/Users/Tao/Documents/Test/DiffBind_Vignette/reads/Chr18_BT474_input.bam
coverage: 166912 bp (0.21%)
Building greylist: C:/Users/Tao/Documents/Test/DiffBind_Vignette/reads/Chr18_MCF7_input.bam
coverage: 106495 bp (0.14%)
Building greylist: C:/Users/Tao/Documents/Test/DiffBind_Vignette/reads/Chr18_T47D_input.bam
coverage: 56832 bp (0.07%)
Building greylist: C:/Users/Tao/Documents/Test/DiffBind_Vignette/reads/Chr18_TAMR_input.bam
coverage: 122879 bp (0.16%)
Building greylist: C:/Users/Tao/Documents/Test/DiffBind_Vignette/reads/Chr18_ZR75_input.bam
coverage: 68608 bp (0.09%)
BT474c: 58 ranges, 166912 bases
MCF7c: 14 ranges, 106495 bases
T47Dc: 11 ranges, 56832 bases
TAMRc: 10 ranges, 122879 bases
ZR75c: 12 ranges, 68608 bases
Master greylist: 69 ranges, 251391 bases
Removed: 50 of 2844 intervals.
Removed 51 (of 2845) consensus peaks.
Normalize DESeq2 with defaults...
Forming default model design and contrast(s)...
Computing results names...
Analyzing...
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
> dba.show(tamoxifen, bContrasts=TRUE)
Factor Group Samples Group2 Samples2 DB.DESeq2
1 Condition Responsive 7 Resistant 4 249
> tamoxifen <- dba.analyze(tamoxifen,method = DBA_EDGER)
Normalize edgeR with defaults...
Analyzing...
> dba.show(tamoxifen, bContrasts=TRUE)
Factor Group Samples Group2 Samples2 DB.edgeR DB.DESeq2
1 Condition Responsive 7 Resistant 4 0 249
> sessionInfo()