Question: DiffBind Heatmap Omitting Results?
jared.andrews07 wrote:

I'm running into a curious issue with DiffBind's dba.plotHeatmap function when used with a single contrast. Either it is omitting some of the results or I'm misusing it, and I'd like to work out which.

I've run my analysis and identified ~13000 differentially bound sites:

> k27_results
24 Samples, 55742 sites in matrix:

...

1 Contrast:
  Group1 Members1 Group2 Members2 DB.DESeq2
1 Normal        6  Tumor       18     13252

Great. Now I check how many are enriched in either the Normal or Tumor samples.

results = dba.report(k27_results)

> sum(results$Fold>0)
[1] 10185
> sum(results$Fold<0)
[1] 3067

Okay, so ~10k enriched in the Normal samples, and ~3k in the Tumor samples. Then I try to make heatmaps to show these.

> dba.plotHeatmap(k27_results, correlations=FALSE, scale="row", density.info="none", colScheme=f_color, breaks=breaks, contrast=1)

[image: heatmap from the call above]

It doesn't appear that the Tumor enriched sites are being displayed. However, when I remove the contrast parameter from the heatmap call, it looks more appropriate, though I thought that displayed all sites, not just those found to be differentially bound. Am I correct in thinking that or am I wrong here? Clarification would be very much appreciated.

> dba.plotHeatmap(k27_results, correlations=FALSE, scale="row", density.info="none", colScheme=f_color, breaks=breaks)

[image: heatmap from the call above]

Let me know if any other information is needed.

Session Info:

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bindrcpp_0.2                            DiffBind_2.6.1                          ReportingTools_2.17.3                  
 [4] knitr_1.17                              BiocInstaller_1.28.0                    DESeq2_1.18.1                          
 [7] SummarizedExperiment_1.8.0              DelayedArray_0.4.1                      matrixStats_0.52.2                     
[10] readr_1.1.1                             tximport_1.6.0                          TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[13] GenomicFeatures_1.30.0                  GenomicRanges_1.30.0                    GenomeInfoDb_1.14.0                    
[16] org.Hs.eg.db_3.5.0                      annotate_1.56.1                         XML_3.98-1.9                           
[19] AnnotationDbi_1.40.0                    IRanges_2.12.0                          S4Vectors_0.16.0                       
[22] Biobase_2.38.0                          BiocGenerics_0.24.0                    

loaded via a namespace (and not attached):
  [1] backports_1.1.1               GOstats_2.44.0                Hmisc_4.0-3                   AnnotationHub_2.10.1         
  [5] plyr_1.8.4                    lazyeval_0.2.1                GSEABase_1.40.1               splines_3.4.2                
  [9] BatchJobs_1.7                 BiocParallel_1.12.0           ggplot2_2.2.1                 amap_0.8-14                  
 [13] digest_0.6.12                 ensembldb_2.2.0               htmltools_0.3.6               GO.db_3.5.0                  
 [17] gdata_2.18.0                  magrittr_1.5                  checkmate_1.8.5               memoise_1.1.0                
 [21] BBmisc_1.11                   BSgenome_1.46.0               cluster_2.0.6                 limma_3.34.2                 
 [25] Biostrings_2.46.0             systemPipeR_1.12.0            R.utils_2.6.0                 ggbio_1.26.0                 
 [29] prettyunits_1.0.2             colorspace_1.3-2              ggrepel_0.7.0                 blob_1.1.0                   
 [33] dplyr_0.7.4                   RCurl_1.95-4.8                graph_1.56.0                  genefilter_1.60.0            
 [37] bindr_0.1                     glue_1.2.0                    brew_1.0-6                    survival_2.41-3              
 [41] sendmailR_1.2-1               VariantAnnotation_1.24.2      gtable_0.2.0                  zlibbioc_1.24.0              
 [45] XVector_0.18.0                Rgraphviz_2.22.0              scales_0.5.0                  pheatmap_1.0.8               
 [49] DBI_0.7                       GGally_1.3.2                  edgeR_3.20.1                  Rcpp_0.12.14                 
 [53] xtable_1.8-2                  progress_1.1.2                htmlTable_1.9                 foreign_0.8-69               
 [57] bit_1.1-12                    OrganismDbi_1.20.0            Formula_1.2-2                 AnnotationForge_1.20.0       
 [61] htmlwidgets_0.9               httr_1.3.1                    gplots_3.0.1                  RColorBrewer_1.1-2           
 [65] acepack_1.4.1                 pkgconfig_2.0.1               reshape_0.8.7                 R.methodsS3_1.7.1            
 [69] nnet_7.3-12                   locfit_1.5-9.1                labeling_0.3                  rlang_0.1.4                  
 [73] reshape2_1.4.2                munsell_0.4.3                 tools_3.4.2                   RSQLite_2.0                  
 [77] stringr_1.2.0                 yaml_2.1.14                   bit64_0.9-7                   caTools_1.17.1               
 [81] AnnotationFilter_1.2.0        RBGL_1.54.0                   mime_0.5                      R.oo_1.21.0                  
 [85] biomaRt_2.34.0                compiler_3.4.2                curl_3.0                      interactiveDisplayBase_1.16.0
 [89] PFAM.db_3.5.0                 tibble_1.3.4                  geneplotter_1.56.0            stringi_1.1.6                
 [93] lattice_0.20-35               ProtGenerics_1.10.0           Matrix_1.2-12                 data.table_1.10.4-3          
 [97] bitops_1.0-6                  httpuv_1.3.5                  rtracklayer_1.38.0            R6_2.2.2                     
[101] latticeExtra_0.6-28           hwriter_1.3.2                 RMySQL_0.10.13                ShortRead_1.36.0             
[105] KernSmooth_2.23-15            gridExtra_2.3                 dichromat_2.0-0               gtools_3.5.0                 
[109] assertthat_0.2.0              Category_2.44.0               rjson_0.2.15                  GenomicAlignments_1.14.1     
[113] Rsamtools_1.30.0              GenomeInfoDbData_0.99.1       hms_0.4.0                     grid_3.4.2                   
[117] rpart_4.1-11                  biovizBase_1.26.0             shiny_1.0.5                   base64enc_0.1-3

Rory Stark (CRUK, Cambridge, UK) wrote:

By default, the maxSites parameter of dba.plotHeatmap() is set to maxSites=1000. This means it will only use the "top" 1,000 intervals (the 1,000 sites with the lowest FDR values when you specify a contrast). You can increase the value of maxSites to plot more sites but it will take longer (the complexity is driven by clustering the sites). 
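
As a rough sketch of what raising maxSites could look like for the object in the question: the call below counts the differentially bound sites with dba.report() and passes that count as maxSites. The variable names db_sites and n_db are made up for illustration, and the colour-related arguments from the original call (density.info, colScheme, breaks) are left out for brevity but can be passed as before.

```r
library(DiffBind)

# k27_results is the analyzed DBA object from the question.
# dba.report() returns the differentially bound sites for the contrast
# (a GRanges object by default), so its length is the number of DB sites.
db_sites <- dba.report(k27_results, contrast = 1)
n_db <- length(db_sites)

# Plot every differentially bound site instead of only the top 1,000.
# This is expected to be slow, because all n_db sites get clustered.
dba.plotHeatmap(k27_results, contrast = 1, correlations = FALSE,
                scale = "row", maxSites = n_db)
```

With roughly 13,000 sites the clustering step may take a while, so an intermediate value such as maxSites = 5000 might be worth trying first.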

When you leave out the contrast, it will plot the sites with the highest scores determined by the sortFun parameter, which is by default sortFun=sd, so you are seeing the 1,000 with the highest standard deviations.

It may be useful to assign the value returned "invisibly" to a variable, so you can examine the specific sites you are plotting.
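
A minimal sketch of that check, assuming (as the dba.plotHeatmap() documentation describes for correlations=FALSE) that the invisibly returned value is a GRanges object of the plotted sites. The names plotted, db_sites and hits are illustrative, and matching plotted sites back to the report by exact coordinates is an assumption about how the two objects line up.

```r
library(DiffBind)
library(GenomicRanges)

# Capture the sites that were actually drawn in the heatmap.
plotted <- dba.plotHeatmap(k27_results, contrast = 1, correlations = FALSE,
                           scale = "row")
length(plotted)   # number of sites plotted (at most maxSites)

# Match the plotted sites to the full report by identical coordinates,
# then tabulate the sign of the fold change for those sites.
db_sites <- dba.report(k27_results, contrast = 1)
hits <- findOverlaps(plotted, db_sites, type = "equal")
table(sign(db_sites$Fold[subjectHits(hits)]))
```

If nearly all of the plotted sites share one sign, the lowest-FDR sites come mostly from one group, which would explain why the other group's sites seem to be missing from the contrast heatmap.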

-Rory


jared.andrews07 replied:

Ah, that clarifies things exceptionally well. Thanks!