Can I completely repeat the DiffBind results using DESeq2?
1
0
Entering edit mode
@shangguandong1996-21805
Last seen 17 months ago
China

Hi, Dr Stark.

I want to do some interactions work when analysising ChIP-Seq or ATAC-Seq. So I decide to use the DiffBind count data as the DESeq2 input, and then use DESeq2 to get more complex design and to calculate FoldChange and pvalue. It seems I can perfectly repeat the log2FoldChange and paritially repeat pvalue. I have known some issues about DiffBind using DESeq2, and I am wondering whether you can give me some another suggestion about repeat DiffBind results using DESeq2.

The code producing count data is list below

dba_meta <- dba(minOverlap = 1, sampleSheet = sample_info)
dba_count <- dba.count(dba_meta,minOverlap = 1)

peak_count_list <- dba_count$peaks
names(peak_count_list) <- dba_count$samples$SampleID

Peak_count <- lapply(peak_count_list, function(x) {x$Reads})
ATAC_Peak_count <- do.call(cbind, Peak_count)
rownames(ATAC_Peak_count) <- paste0(experiment_name,"_",1:dim(ATAC_Peak_count)[1])

Then I get coldata

coldata <- data.frame(row.names = colnames(ATAC_Peak_count), 
                      type_time = gsub("_R[12]", "", colnames(ATAC_Peak_count))
                      )

# dds
dds <- DESeqDataSetFromMatrix(countData = ATAC_Peak_count, 
                              colData = coldata,
                              design = ~ type_time)

I noticed DiffBind will use the bFullLibrarySize=TURE, So I replace the sizeFactors with bamlibSize

reads_number <- as.numeric(dba_count$class["Reads", ])
new_SF <- reads_number / min(reads_number)
sizeFactors(dds) <- new_SF

And I also found this line in vignette

estimateDispersionsis then called with theDESeqDataSetobject andfitTypeset tolocal. Next the model is fitted and tested usingnbinomWaldTest. The final results (as aDESeqDataSet) are accessible within theDBAobject as

So I

estimateDispersions(dds, fitType = "local")
dds <- nbinomWaldTest(dds)
dds_results <- results(dds)
 sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /opt/sysoft/R-3.6.1/lib64/R/lib/libRblas.so
LAPACK: /opt/sysoft/R-3.6.1/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] DESeq2_1.26.0               DiffBind_2.14.0             SummarizedExperiment_1.16.0
 [4] DelayedArray_0.12.0         BiocParallel_1.19.6         matrixStats_0.55.0         
 [7] Biobase_2.46.0              GenomicRanges_1.38.0        GenomeInfoDb_1.22.0        
[10] IRanges_2.20.0              S4Vectors_0.24.0            BiocGenerics_0.32.0        

loaded via a namespace (and not attached):
  [1] amap_0.8-18              colorspace_1.4-1         rjson_0.2.20            
  [4] hwriter_1.3.2            htmlTable_1.13.3         XVector_0.26.0          
  [7] base64enc_0.1-3          rstudioapi_0.10          ggrepel_0.8.1           
 [10] bit64_0.9-7              AnnotationDbi_1.48.0     splines_3.6.1           
 [13] geneplotter_1.64.0       knitr_1.26               zeallot_0.1.0           
 [16] Formula_1.2-3            Rsamtools_2.2.3          packrat_0.5.0           
 [19] annotate_1.64.0          cluster_2.1.0            GO.db_3.10.0            
 [22] dbplyr_1.4.2             png_0.1-7                pheatmap_1.0.12         
 [25] graph_1.64.0             compiler_3.6.1           httr_1.4.1              
 [28] GOstats_2.52.0           backports_1.1.5          assertthat_0.2.1        
 [31] Matrix_1.2-18            lazyeval_0.2.2           limma_3.42.0            
 [34] htmltools_0.4.0          acepack_1.4.1            prettyunits_1.0.2       
 [37] tools_3.6.1              gtable_0.3.0             glue_1.3.1              
 [40] GenomeInfoDbData_1.2.2   Category_2.52.0          systemPipeR_1.20.0      
 [43] dplyr_0.8.3              batchtools_0.9.11        rappdirs_0.3.1          
 [46] ShortRead_1.44.0         Rcpp_1.0.3               vctrs_0.2.1             
 [49] Biostrings_2.54.0        gdata_2.18.0             rtracklayer_1.46.0      
 [52] xfun_0.11                stringr_1.4.0            lifecycle_0.1.0         
 [55] gtools_3.8.1             XML_3.98-1.20            edgeR_3.28.0            
 [58] zlibbioc_1.32.0          scales_1.1.0             BSgenome_1.54.0         
 [61] VariantAnnotation_1.32.0 hms_0.5.2                RBGL_1.62.0             
 [64] RColorBrewer_1.1-2       yaml_2.2.0               curl_4.3                
 [67] gridExtra_2.3            memoise_1.1.0            ggplot2_3.2.1           
 [70] rpart_4.1-15             biomaRt_2.42.0           latticeExtra_0.6-29     
 [73] stringi_1.4.3            RSQLite_2.1.5            genefilter_1.68.0       
 [76] checkmate_1.9.4          GenomicFeatures_1.38.0   caTools_1.17.1.3        
 [79] rlang_0.4.2              pkgconfig_2.0.3          bitops_1.0-6            
 [82] lattice_0.20-38          purrr_0.3.3              htmlwidgets_1.5.1       
 [85] GenomicAlignments_1.22.0 bit_1.1-14               tidyselect_0.2.5        
 [88] GSEABase_1.48.0          AnnotationForge_1.28.0   magrittr_1.5            
 [91] R6_2.4.1                 gplots_3.0.1.1           Hmisc_4.3-0             
 [94] base64url_1.4            DBI_1.1.0                foreign_0.8-73          
 [97] pillar_1.4.3             withr_2.1.2              nnet_7.3-12             
[100] survival_3.1-8           RCurl_1.95-4.12          tibble_2.1.3            
[103] crayon_1.3.4             KernSmooth_2.23-16       BiocFileCache_1.10.0    
[106] jpeg_0.1-8.1             progress_1.2.2           locfit_1.5-9.1          
[109] grid_3.6.1               data.table_1.12.6        blob_1.2.0              
[112] Rgraphviz_2.30.0         digest_0.6.23            xtable_1.8-4            
[115] brew_1.0-6               openssl_1.4.1            munsell_0.5.0           
[118] askpass_1.1             
diffbind • 737 views
ADD COMMENT
1
Entering edit mode
Rory Stark ★ 5.1k
@rory-stark-5741
Last seen 9 weeks ago
Cambridge, UK

Looks like you're reproducing this pretty well, not sure exactly what is causing the difference.

The upcoming release of DiffBind will support the ability to include arbitrary designs and complex contrasts, made up of any of the metadata factors Tissue, Factor, Condition, Treatment, Replicate, and Caller. It also has a simple interface to retrieve the DESeq2 DESeqDataSet (or the edgeR DGEList) object.

ADD COMMENT
0
Entering edit mode

This functionality has now been released in the current version.

ADD REPLY

Login before adding your answer.

Traffic: 749 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6