DiffBind broad and narrow peak identification
1
0
Entering edit mode
@reubenmcgregor88-13722
Last seen 9 months ago

Hi, 

I have histone, broad-peak data, however some of the peaks are actually rather narrow. So what I can see happening is that the broader peaks are being nicely identified as differentially expressed by diffbind, but the narrow peaks which seem highly up-regulated (with big fold changes) are not reaching FDR significance. My interpretation is that there are less counts compared to broad peaks.

The code I used for differential peak analysis is as follows:

> samples <- dba(sampleSheet=diffbind_samples, peakCaller = "bed", peakFormat = "bed")
> samples_counts <- dba.count(samples)
> samples_counts <- dba.contrast(samples_counts, categories=DBA_CONDITION, minMembers=2)
> samples_counts <- dba.analyze(samples_counts)

I wonder whether this is something you have come across before? I have tried playing around with "summits" function and "bFullLibrarySize" but not sure how I can best tackle this problem? To give you an idea here is the readout for one of the narrow peaks in question:

Chr Start End Conc Conc_vitd Conc_eth Fold p-value FDR Called1 Called2 k27acvitd1 k27acvitd2 k27aceth1 k27aceth2  
chr2 102918383 102918857 7.02 7.79 5.23 2.55 0.00444 0.513 2 1 217.33 225.1 69.3 6  

So the eth condition goes from 69.3 and 6 to 217.33 and 225.1

and this is one of the broader peaks which is identified as differentially expressed:

Chr Start End Conc Conc_vitd Conc_eth Fold p-value FDR Called1 Called2 k27acvitd1 k27acvitd2 k27aceth1 k27aceth2
chr2 204693199 204696874 9.14 9.92 7.33 2.59 3.60E-06 0.00245 2 2 935.67 1004.41 256.79 65

 Here is my sample sheet:

SampleID Tissue Factor Condition Treatment Replicate bamReads ControlID bamControl Peaks PeakCaller
                       

1

k27acvitd1

cd4

k27ac

vitd

a

1

Bowtie_1_27Ac_VitD.bam

natinput

Bowtie_1_NInput_Eth.bam

27ac.vitD.don1.peaks.styleregion.250bp.mindist5kb_diffbind.bed

bed

2

k27acvitd2

cd4

k27ac

vitd

a

2

Bowtie_2_27Ac_VitD.bam

natinput

Bowtie_1_NInput_Eth.bam

27ac.vitD.don2.peaks.styleregion.250bp.mindist5kb.diffbind.bed

bed

3

k27aceth1

cd4

k27ac

eth

a

1

Bowtie_1_27Ac_Eth.bam

natinput

Bowtie_1_NInput_Eth.bam

27ac.eth.don1.peaks.styleregion.250bp.mindist5kb_diffbind.bed

bed

4

k27aceth2

cd4

k27ac

eth

a

2

Bowtie_2_27Ac_Eth.bam

natinput

Bowtie_1_NInput_Eth.bam

27ac.eth.don2.peaks.styleregion.250bp.mindist5kb.diffbind.bed

bed

sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] readr_1.1.1                DiffBind_2.4.8             SummarizedExperiment_1.6.3
 [4] DelayedArray_0.2.7         matrixStats_0.52.2         Biobase_2.36.2            
 [7] GenomicRanges_1.28.4       GenomeInfoDb_1.12.2        IRanges_2.10.2            
[10] S4Vectors_0.14.3           BiocGenerics_0.22.0       

loaded via a namespace (and not attached):
  [1] Category_2.42.1          bitops_1.0-6             bit64_0.9-7              RColorBrewer_1.1-2      
  [5] httr_1.3.1               tools_3.4.1              backports_1.1.0          R6_2.2.2                
  [9] rpart_4.1-11             KernSmooth_2.23-15       Hmisc_4.0-3              DBI_0.7                 
 [13] lazyeval_0.2.0           colorspace_1.3-2         nnet_7.3-12              gridExtra_2.2.1         
 [17] DESeq2_1.16.1            bit_1.1-12               compiler_3.4.1           sendmailR_1.2-1         
 [21] graph_1.54.0             htmlTable_1.9            plotly_4.7.1             rtracklayer_1.36.4      
 [25] caTools_1.17.1           scales_0.5.0             checkmate_1.8.3          BatchJobs_1.6           
 [29] genefilter_1.58.1        RBGL_1.52.0              stringr_1.2.0            digest_0.6.12           
 [33] Rsamtools_1.28.0         foreign_0.8-69           AnnotationForge_1.18.1   XVector_0.16.0          
 [37] base64enc_0.1-3          pkgconfig_2.0.1          htmltools_0.3.6          limma_3.32.5            
 [41] htmlwidgets_0.9          rlang_0.1.2              RSQLite_2.0              BBmisc_1.11             
 [45] GOstats_2.42.0           bindr_0.1                hwriter_1.3.2            jsonlite_1.5            
 [49] BiocParallel_1.10.1      gtools_3.5.0             acepack_1.4.1            dplyr_0.7.2             
 [53] RCurl_1.95-4.8           magrittr_1.5             GO.db_3.4.1              GenomeInfoDbData_0.99.0 
 [57] Formula_1.2-2            Matrix_1.2-11            Rcpp_0.12.12             munsell_0.4.3           
 [61] stringi_1.1.5            edgeR_3.18.1             zlibbioc_1.22.0          fail_1.3                
 [65] gplots_3.0.1             plyr_1.8.4               grid_3.4.1               blob_1.1.0              
 [69] ggrepel_0.6.5            gdata_2.18.0             lattice_0.20-35          Biostrings_2.44.2       
 [73] splines_3.4.1            GenomicFeatures_1.28.4   annotate_1.54.0          hms_0.3                 
 [77] locfit_1.5-9.1           knitr_1.17               rjson_0.2.15             systemPipeR_1.10.2      
 [81] geneplotter_1.54.0       biomaRt_2.32.1           XML_3.98-1.9             glue_1.1.1              
 [85] ShortRead_1.34.0         latticeExtra_0.6-28      data.table_1.10.4        gtable_0.2.0            
 [89] purrr_0.2.3              tidyr_0.7.0              amap_0.8-14              assertthat_0.2.0        
 [93] ggplot2_2.2.1            xtable_1.8-2             survival_2.41-3          viridisLite_0.2.0       
 [97] pheatmap_1.0.8           tibble_1.3.4             GenomicAlignments_1.12.2 AnnotationDbi_1.38.2    
[101] memoise_1.1.0            bindrcpp_0.2             cluster_2.0.6            brew_1.0-6              
[105] GSEABase_1.38.1   

 
diffbind r chipseq • 1.2k views
ADD COMMENT
1
Entering edit mode
Rory Stark ★ 4.1k
@rory-stark-5741
Last seen 22 days ago
CRUK, Cambridge, UK

In the example you show, while the fold-change difference is similar, there is quite a bit more variance in the second sample group, so it may not be surprising that it gets a higher p-value that gets more drastically adjusted by the FDR correction.

You mention trying the summits option in dba.count(), how did that change things? That's what I would try -- see my recent discussion of this here: A: How to select summit size for histone ChiP-Seq in Diffbind

-Rory

ADD COMMENT

Login before adding your answer.

Traffic: 348 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6