Can I completely repeat the DiffBind results using DESeq2?
Entering edit mode
Last seen 8 months ago

Hi, Dr Stark.

I want to do some interactions work when analysising ChIP-Seq or ATAC-Seq. So I decide to use the DiffBind count data as the DESeq2 input, and then use DESeq2 to get more complex design and to calculate FoldChange and pvalue. It seems I can perfectly repeat the log2FoldChange and paritially repeat pvalue. I have known some issues about DiffBind using DESeq2, and I am wondering whether you can give me some another suggestion about repeat DiffBind results using DESeq2.

The code producing count data is list below

dba_meta <- dba(minOverlap = 1, sampleSheet = sample_info)
dba_count <- dba.count(dba_meta,minOverlap = 1)

peak_count_list <- dba_count$peaks
names(peak_count_list) <- dba_count$samples$SampleID

Peak_count <- lapply(peak_count_list, function(x) {x$Reads})
ATAC_Peak_count <-, Peak_count)
rownames(ATAC_Peak_count) <- paste0(experiment_name,"_",1:dim(ATAC_Peak_count)[1])

Then I get coldata

coldata <- data.frame(row.names = colnames(ATAC_Peak_count), 
                      type_time = gsub("_R[12]", "", colnames(ATAC_Peak_count))

# dds
dds <- DESeqDataSetFromMatrix(countData = ATAC_Peak_count, 
                              colData = coldata,
                              design = ~ type_time)

I noticed DiffBind will use the bFullLibrarySize=TURE, So I replace the sizeFactors with bamlibSize

reads_number <- as.numeric(dba_count$class["Reads", ])
new_SF <- reads_number / min(reads_number)
sizeFactors(dds) <- new_SF

And I also found this line in vignette

estimateDispersionsis then called with theDESeqDataSetobject andfitTypeset tolocal. Next the model is fitted and tested usingnbinomWaldTest. The final results (as aDESeqDataSet) are accessible within theDBAobject as

So I

estimateDispersions(dds, fitType = "local")
dds <- nbinomWaldTest(dds)
dds_results <- results(dds)
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /opt/sysoft/R-3.6.1/lib64/R/lib/
LAPACK: /opt/sysoft/R-3.6.1/lib64/R/lib/

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] DESeq2_1.26.0               DiffBind_2.14.0             SummarizedExperiment_1.16.0
 [4] DelayedArray_0.12.0         BiocParallel_1.19.6         matrixStats_0.55.0         
 [7] Biobase_2.46.0              GenomicRanges_1.38.0        GenomeInfoDb_1.22.0        
[10] IRanges_2.20.0              S4Vectors_0.24.0            BiocGenerics_0.32.0        

loaded via a namespace (and not attached):
  [1] amap_0.8-18              colorspace_1.4-1         rjson_0.2.20            
  [4] hwriter_1.3.2            htmlTable_1.13.3         XVector_0.26.0          
  [7] base64enc_0.1-3          rstudioapi_0.10          ggrepel_0.8.1           
 [10] bit64_0.9-7              AnnotationDbi_1.48.0     splines_3.6.1           
 [13] geneplotter_1.64.0       knitr_1.26               zeallot_0.1.0           
 [16] Formula_1.2-3            Rsamtools_2.2.3          packrat_0.5.0           
 [19] annotate_1.64.0          cluster_2.1.0            GO.db_3.10.0            
 [22] dbplyr_1.4.2             png_0.1-7                pheatmap_1.0.12         
 [25] graph_1.64.0             compiler_3.6.1           httr_1.4.1              
 [28] GOstats_2.52.0           backports_1.1.5          assertthat_0.2.1        
 [31] Matrix_1.2-18            lazyeval_0.2.2           limma_3.42.0            
 [34] htmltools_0.4.0          acepack_1.4.1            prettyunits_1.0.2       
 [37] tools_3.6.1              gtable_0.3.0             glue_1.3.1              
 [40] GenomeInfoDbData_1.2.2   Category_2.52.0          systemPipeR_1.20.0      
 [43] dplyr_0.8.3              batchtools_0.9.11        rappdirs_0.3.1          
 [46] ShortRead_1.44.0         Rcpp_1.0.3               vctrs_0.2.1             
 [49] Biostrings_2.54.0        gdata_2.18.0             rtracklayer_1.46.0      
 [52] xfun_0.11                stringr_1.4.0            lifecycle_0.1.0         
 [55] gtools_3.8.1             XML_3.98-1.20            edgeR_3.28.0            
 [58] zlibbioc_1.32.0          scales_1.1.0             BSgenome_1.54.0         
 [61] VariantAnnotation_1.32.0 hms_0.5.2                RBGL_1.62.0             
 [64] RColorBrewer_1.1-2       yaml_2.2.0               curl_4.3                
 [67] gridExtra_2.3            memoise_1.1.0            ggplot2_3.2.1           
 [70] rpart_4.1-15             biomaRt_2.42.0           latticeExtra_0.6-29     
 [73] stringi_1.4.3            RSQLite_2.1.5            genefilter_1.68.0       
 [76] checkmate_1.9.4          GenomicFeatures_1.38.0   caTools_1.17.1.3        
 [79] rlang_0.4.2              pkgconfig_2.0.3          bitops_1.0-6            
 [82] lattice_0.20-38          purrr_0.3.3              htmlwidgets_1.5.1       
 [85] GenomicAlignments_1.22.0 bit_1.1-14               tidyselect_0.2.5        
 [88] GSEABase_1.48.0          AnnotationForge_1.28.0   magrittr_1.5            
 [91] R6_2.4.1                 gplots_3.0.1.1           Hmisc_4.3-0             
 [94] base64url_1.4            DBI_1.1.0                foreign_0.8-73          
 [97] pillar_1.4.3             withr_2.1.2              nnet_7.3-12             
[100] survival_3.1-8           RCurl_1.95-4.12          tibble_2.1.3            
[103] crayon_1.3.4             KernSmooth_2.23-16       BiocFileCache_1.10.0    
[106] jpeg_0.1-8.1             progress_1.2.2           locfit_1.5-9.1          
[109] grid_3.6.1               data.table_1.12.6        blob_1.2.0              
[112] Rgraphviz_2.30.0         digest_0.6.23            xtable_1.8-4            
[115] brew_1.0-6               openssl_1.4.1            munsell_0.5.0           
[118] askpass_1.1             
diffbind • 446 views
Entering edit mode
Rory Stark ★ 4.4k
Last seen 1 day ago
CRUK, Cambridge, UK

Looks like you're reproducing this pretty well, not sure exactly what is causing the difference.

The upcoming release of DiffBind will support the ability to include arbitrary designs and complex contrasts, made up of any of the metadata factors Tissue, Factor, Condition, Treatment, Replicate, and Caller. It also has a simple interface to retrieve the DESeq2 DESeqDataSet (or the edgeR DGEList) object.

Entering edit mode

This functionality has now been released in the current version.


Login before adding your answer.

Traffic: 181 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6