Question: diffbind counts as same values
0
gravatar for chadnewton
4 weeks ago by
chadnewton0
chadnewton0 wrote:

I have run ATAC-seq on 4 samples (2 groups with 2 samples each) and I am wanting to differential peak calling between groups but I noticed that when I generate a report of my analysis with dba.report() two of the samples have the exact same count values across all peaks (6.61 for both c1s and c3r). the samples are grouped by condition (slow vs rapid). For this experiment c1s and c4s are in the "slow" condition and c2r and c3r are in the "rapid" condition.

dat.count <- dba.count(dat, minOverlap = 0, score = DBA_SCORE_RPKM, bUseSummarizeOverlaps = FALSE,  filterFun = max, bRemoveDuplicates = FALSE, bScaleControl = FALSE

dat.con <- dba.contrast(dat.count, categories = DBA_CONDITION, minMembers = 2)

dat.an <- dba.analyze(dat.con, method = c(DBA_EDGER), bTagwise = FALSE, bFullLibrarySize = FALSE)

DB <- dba.reportdat.an, method = DBA_EDGER, th = 0.05, bUsePval = F, fold = 0, bNormalized = TRUE,
                    bCalled = TRUE, bCalledDetail = FALSE, bCounts = TRUE, precision = 0,
                    file = "cn1.a.reads", ext = ".csv" )
df = as.data.frame(DB)
head(df)

      seqnames    start      end width strand     Conc Conc_slow Conc_rapid      Fold      p.value
72377     chr22 24372720 24373588   869      * 4.978275  1.762114   5.898487 -4.136373 3.585008e-13
111971     chr8  8085285  8086444  1160      * 5.105623  1.832666   6.028998 -4.196332 1.659267e-12
127355     chrY 15016435 15017735  1301      * 4.340530  1.762114   5.214402 -3.452288 3.263057e-11
127389     chrY 22737178 22738068   891      * 4.318821  1.762114   5.190694 -3.428580 3.795535e-11
44922     chr16 90061131 90062608  1478      * 5.746760  2.463868   6.670675 -4.206807 4.227431e-10
127273     chrY  2802939  2804293  1355      * 3.878970  1.762114   4.702272 -2.940158 8.061298e-10
                FDR Called1 Called2  c1s  c4s    c2r  c3r
72377  4.567694e-08       0       1 6.61 0.17 112.69 6.61
111971 1.057044e-07       1       1 6.61 0.51 123.98 6.61
127355 1.208982e-06       1       2 6.61 0.17  67.64 6.61
127389 1.208982e-06       1       2 6.61 0.17  66.43 6.61
44922  1.077242e-05       1       2 6.61 4.42 197.14 6.61
127273 1.609109e-05       1       2 6.61 0.17  45.45 6.61

Can you tell me why the values for c1s and c3r are all the same? Is this a problem with my input files (bamReads are sorted .bam files and Peaks are narrowPeak files from MACS2)? Or is this a problem with the normalization? I tried using different scores (TMM, etc) but the problem persists.

Any help is appreciated!

ADD COMMENTlink modified 13 days ago by Rory Stark2.8k • written 4 weeks ago by chadnewton0
Answer: diffbind counts as same values
0
gravatar for Rory Stark
13 days ago by
Rory Stark2.8k
CRUK, Cambridge, UK
Rory Stark2.8k wrote:

I suspect this is a normalization issue, where the normalized read values are all the same. You can check this by setting bNormalized=FALSE in the call to dba.report() to see the "raw" read values.

I also see that in the call dba.count(), you have set bScaleControl=FALSE. As you are subtracting control reads by default in the call to dba.analyze(), it may be that the number of control reads is greater than the number of ChIP reads in certain samples; these will all be set to 1 prior to normalization. You can turn off the subtraction of control reads by setting bSubControl=FALSE. You can also try using the DESeq2 analysis method, which will deploy a different, and less corrective, normali`ation scheme.

ADD COMMENTlink written 13 days ago by Rory Stark2.8k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 184 users visited in the last hour