Exact same results in Diffbind with different normalization methods
0
0
Entering edit mode
red_bricks ▴ 60
@red_bricks-14034
Last seen 6 months ago
United States

I am comparing the results of DESeq2 'background' normalization with library size normalization with two different spike-ins (specified using the SpikeIn column in the samplesheet). I am getting the exact same Pvalues for all three methods. I did check that the DBA$norm$DESeq2$norm.facs values are different in each case. The MA plot using normalized counts also does not seem to be consistent with the DB sites called; lots of up-regulated sites called even though the distribution is globally shifted to <0 fold change.

dba.plotMA(diffbind, contrast=1, th = fdr_cutoff)

MA_plot

# make DBA object
diffbind <- dba(sampleSheet = samples %>% dplyr::select(-out_pref))

# set num cores and how many reads to store in memory
diffbind$config$cores <- db_config$cores-1
diffbind$config$yieldSize <- 20000000

# get counts
diffbind$config$scanbamparam <- ScanBamParam(flag = scanBamFlag(isDuplicate=FALSE, isSecondaryAlignment=FALSE, isSupplementaryAlignment=FALSE, isPaired=TRUE, isProperPair=TRUE, isNotPassingQualityControls=FALSE, isUnmappedQuery=FALSE, hasUnmappedMate=FALSE, isMinusStrand=NA, isMateMinusStrand=NA))


# get counts

diffbind <- dba.count(diffbind, summits=DB_summits, bUseSummarizeOverlaps = TRUE, bParallel = TRUE)
print(diffbind)

# normalization
if ("Spikein" %in% colnames(samples)){
    diffbind <- dba.normalize(diffbind, normalize=DBA_NORM_LIB, spikein=TRUE)
} else{
    diffbind <- dba.normalize(diffbind, normalize = DBA_NORM_NATIVE, background = TRUE)
}

# print the normalization methods
dba.normalize(diffbind, bRetrieve=TRUE)

comps <- list(
  list("GroupA", "GroupB")
)

for (comp in comps){
  diffbind <- dba.contrast(diffbind, 
                           group1=diffbind$masks[[ comp[[1]] ]], 
                           group2=diffbind$masks[[ comp[[2]] ]],
                           name1=comp[[1]], 
                           name2=comp[[2]])
}

dba.show(diffbind, bContrasts=TRUE)

diffbind <- dba.analyze(diffbind, bBlacklist=FALSE, bGreylist=FALSE)
dba.show(diffbind, bContrasts = TRUE)
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: AlmaLinux 9.0 (Emerald Puma)
## 
## Matrix products: default
## BLAS:   /opt/R/R-4.2.1/lib64/R/lib/libRblas.so
## LAPACK: /opt/R/R-4.2.1/lib64/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
##  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
##  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
## [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] enrichplot_1.18.4                        
##  [2] readxl_1.4.2                             
##  [3] openxlsx_4.2.5.2                         
##  [4] ReactomePA_1.42.0                        
##  [5] DOSE_3.24.2                              
##  [6] clusterProfiler_4.6.2                    
##  [7] org.Mm.eg.db_3.16.0                      
##  [8] ChIPseeker_1.34.1                        
##  [9] TxDb.Mmusculus.UCSC.mm10.knownGene_3.10.0
## [10] GenomicFeatures_1.50.4                   
## [11] AnnotationDbi_1.60.2                     
## [12] cowplot_1.1.1                            
## [13] ggplot2_3.4.2                            
## [14] lattice_0.21-8                           
## [15] gridExtra_2.3                            
## [16] patchwork_1.1.2                          
## [17] rtracklayer_1.58.0                       
## [18] Rsamtools_2.14.0                         
## [19] Biostrings_2.66.0                        
## [20] XVector_0.38.0                           
## [21] kableExtra_1.3.4                         
## [22] DiffBind_3.8.4                           
## [23] SummarizedExperiment_1.28.0              
## [24] Biobase_2.58.0                           
## [25] MatrixGenerics_1.10.0                    
## [26] matrixStats_0.63.0                       
## [27] GenomicRanges_1.50.2                     
## [28] GenomeInfoDb_1.34.9                      
## [29] IRanges_2.32.0                           
## [30] S4Vectors_0.36.2                         
## [31] BiocGenerics_0.44.0                      
## [32] readr_2.1.4                              
## [33] stringr_1.5.0                            
## [34] dplyr_1.1.2                              
## 
## loaded via a namespace (and not attached):
##   [1] utf8_1.2.3                             
##   [2] tidyselect_1.2.0                       
##   [3] RSQLite_2.3.1                          
##   [4] htmlwidgets_1.6.2                      
##   [5] grid_4.2.1                             
##   [6] BiocParallel_1.32.6                    
##   [7] scatterpie_0.1.9                       
##   [8] munsell_0.5.0                          
##   [9] codetools_0.2-19                       
##  [10] interp_1.1-4                           
##  [11] systemPipeR_2.4.0                      
##  [12] withr_2.5.0                            
##  [13] colorspace_2.1-0                       
##  [14] GOSemSim_2.24.0                        
##  [15] filelock_1.0.2                         
##  [16] highr_0.10                             
##  [17] knitr_1.42                             
##  [18] rstudioapi_0.14                        
##  [19] labeling_0.4.2                         
##  [20] bbmle_1.0.25                           
##  [21] GenomeInfoDbData_1.2.9                 
##  [22] mixsqp_0.3-48                          
##  [23] hwriter_1.3.2.1                        
##  [24] polyclip_1.10-4                        
##  [25] bit64_4.0.5                            
##  [26] farver_2.1.1                           
##  [27] downloader_0.4                         
##  [28] treeio_1.22.0                          
##  [29] coda_0.19-4                            
##  [30] vctrs_0.6.2                            
##  [31] generics_0.1.3                         
##  [32] gson_0.1.0                             
##  [33] xfun_0.39                              
##  [34] BiocFileCache_2.6.1                    
##  [35] R6_2.5.1                               
##  [36] apeglm_1.20.0                          
##  [37] graphlayouts_0.8.4                     
##  [38] invgamma_1.1                           
##  [39] locfit_1.5-9.7                         
##  [40] bitops_1.0-7                           
##  [41] cachem_1.0.7                           
##  [42] fgsea_1.24.0                           
##  [43] gridGraphics_0.5-1                     
##  [44] DelayedArray_0.24.0                    
##  [45] BiocIO_1.8.0                           
##  [46] scales_1.2.1                           
##  [47] ggraph_2.1.0                           
##  [48] gtable_0.3.3                           
##  [49] tidygraph_1.2.3                        
##  [50] rlang_1.1.3                            
##  [51] systemfonts_1.0.4                      
##  [52] splines_4.2.1                          
##  [53] lazyeval_0.2.2                         
##  [54] yaml_2.3.7                             
##  [55] reshape2_1.4.4                         
##  [56] qvalue_2.30.0                          
##  [57] tools_4.2.1                            
##  [58] ggplotify_0.1.0                        
##  [59] gplots_3.1.3                           
##  [60] jquerylib_0.1.4                        
##  [61] RColorBrewer_1.1-3                     
##  [62] Rcpp_1.0.10                            
##  [63] plyr_1.8.8                             
##  [64] progress_1.2.2                         
##  [65] zlibbioc_1.44.0                        
##  [66] purrr_1.0.1                            
##  [67] RCurl_1.98-1.12                        
##  [68] prettyunits_1.1.1                      
##  [69] deldir_1.0-6                           
##  [70] viridis_0.6.2                          
##  [71] ashr_2.2-54                            
##  [72] ggrepel_0.9.3                          
##  [73] magrittr_2.0.3                         
##  [74] data.table_1.14.8                      
##  [75] reactome.db_1.82.0                     
##  [76] truncnorm_1.0-9                        
##  [77] mvtnorm_1.1-3                          
##  [78] SQUAREM_2021.1                         
##  [79] amap_0.8-19                            
##  [80] xtable_1.8-4                           
##  [81] hms_1.1.3                              
##  [82] evaluate_0.23                          
##  [83] HDO.db_0.99.1                          
##  [84] XML_3.99-0.14                          
##  [85] emdbook_1.3.12                         
##  [86] jpeg_0.1-10                            
##  [87] compiler_4.2.1                         
##  [88] biomaRt_2.54.1                         
##  [89] bdsmatrix_1.3-6                        
##  [90] tibble_3.2.1                           
##  [91] shadowtext_0.1.2                       
##  [92] KernSmooth_2.23-20                     
##  [93] crayon_1.5.2                           
##  [94] htmltools_0.5.5                        
##  [95] ggfun_0.0.9                            
##  [96] tzdb_0.3.0                             
##  [97] geneplotter_1.76.0                     
##  [98] tidyr_1.3.0                            
##  [99] aplot_0.1.10                           
## [100] DBI_1.1.3                              
## [101] tweenr_2.0.2                           
## [102] dbplyr_2.3.2                           
## [103] MASS_7.3-59                            
## [104] rappdirs_0.3.3                         
## [105] boot_1.3-28.1                          
## [106] ShortRead_1.56.1                       
## [107] Matrix_1.5-4                           
## [108] cli_3.6.1                              
## [109] parallel_4.2.1                         
## [110] igraph_1.4.2                           
## [111] pkgconfig_2.0.3                        
## [112] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
## [113] GenomicAlignments_1.34.1               
## [114] numDeriv_2016.8-1.1                    
## [115] xml2_1.3.3                             
## [116] annotate_1.76.0                        
## [117] ggtree_3.6.2                           
## [118] svglite_2.1.1                          
## [119] bslib_0.4.2                            
## [120] webshot_0.5.4                          
## [121] rvest_1.0.3                            
## [122] yulab.utils_0.0.6                      
## [123] digest_0.6.34                          
## [124] graph_1.76.0                           
## [125] cellranger_1.1.0                       
## [126] rmarkdown_2.21                         
## [127] fastmatch_1.1-3                        
## [128] tidytree_0.4.2                         
## [129] restfulr_0.0.15                        
## [130] GreyListChIP_1.30.0                    
## [131] curl_5.0.1                             
## [132] graphite_1.44.0                        
## [133] gtools_3.9.4                           
## [134] rjson_0.2.21                           
## [135] lifecycle_1.0.3                        
## [136] nlme_3.1-162                           
## [137] jsonlite_1.8.7                         
## [138] viridisLite_0.4.1                      
## [139] limma_3.54.2                           
## [140] BSgenome_1.66.3                        
## [141] fansi_1.0.4                            
## [142] pillar_1.9.0                           
## [143] plotrix_3.8-2                          
## [144] KEGGREST_1.38.0                        
## [145] fastmap_1.1.1                          
## [146] httr_1.4.5                             
## [147] GO.db_3.16.0                           
## [148] glue_1.6.2                             
## [149] zip_2.3.0                              
## [150] png_0.1-8                              
## [151] bit_4.0.5                              
## [152] ggforce_0.4.1                          
## [153] stringi_1.7.12                         
## [154] sass_0.4.5                             
## [155] blob_1.2.4                             
## [156] DESeq2_1.38.3                          
## [157] latticeExtra_0.6-30                    
## [158] caTools_1.18.2                         
## [159] memoise_2.0.1                          
## [160] irlba_2.3.5.1                          
## [161] ape_5.7-1
DiffBind • 424 views
ADD COMMENT
0
Entering edit mode

To help make it easier to reproduce this issue, I was able to reproduce similar behavior using the included tamoxifen dataset. I also realized that while the PValues are identical, the 'Conc' columns are distinct.

Packages loaded

library(DiffBind)

data(tamoxifen_counts)

Library size norm

libsize <- dba.normalize(tamoxifen, normalize = DBA_NORM_LIB, background = FALSE)
dba.normalize(libsize, bRetrieve = TRUE)

libsize <- dba.contrast(libsize, 
                        group1=libsize$masks[["Resistant"]], 
                        group2=libsize$masks[["Responsive"]],
                        name1="resis", 
                        name2="respo")

libsize <- dba.analyze(libsize, bBlacklist=FALSE, bGreylist=FALSE)

libsize_res <- dba.report(libsize, th=1, contrast = 1)
$norm.method
[1] "lib"

$norm.factors
 [1] 0.8099610 0.8232056 0.4298993 0.4567323 0.5786191 0.4962278 1.8309087 0.7652039 0.7361583 0.8771445 3.1959394

$lib.method
[1] "full"

$lib.sizes
 [1]  652697  663370  346429  368052  466273  399879 1475415  616630  593224  706836 2575408

$filter.value
[1] 1

Analyzing...
converting counts to integer mode
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates

Background counts RLE

bkgd <- dba.normalize(tamoxifen, normalize = DBA_NORM_NATIVE, background = TRUE)
dba.normalize(bkgd, bRetrieve = TRUE)

bkgd <- dba.contrast(bkgd, 
                    group1=bkgd$masks[["Resistant"]], 
                    group2=bkgd$masks[["Responsive"]],
                    name1="resis", 
                    name2="respo")

bkgd <- dba.analyze(bkgd, bBlacklist=FALSE, bGreylist=FALSE)

bkgd_res <- dba.report(bkgd, th=1, contrast = 1)
$background
[1] TRUE

$norm.method
[1] "RLE"

$norm.factors
   BT4741    BT4742     MCF71     MCF72     MCF73     T47D1     T47D2    MCF7r1    MCF7r2     ZR751     ZR752 
1.0598725 1.0944302 0.4371205 0.5646943 0.6593434 0.6744580 2.6715909 0.8636261 0.9387891 0.9150974 3.9306407 

$lib.method
[1] "background"

$lib.sizes
 [1]  652697  663370  346429  368052  466273  399879 1475415  616630  593224  706836 2575408

$filter.value
[1] 1

Analyzing...
converting counts to integer mode
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates

Compare results

all.equal(libsize_res$Conc, bkgd_res$Conc)
all.equal(libsize_res$Fold, bkgd_res$Fold)
identical(libsize_res$`p-value`, bkgd_res$`p-value`)
identical(libsize_res$FDR, bkgd_res$FDR)
[1] "Mean relative difference: 0.0494044"
[1] "Mean relative difference: 0.1242798"
[1] TRUE
[1] TRUE

SessionInfo

sessionInfo()
    R version 4.3.0 (2023-04-21)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: AlmaLinux 9.2 (Turquoise Kodkod)

    Matrix products: default
    BLAS:   /varidata/research/software/BBC/R/hpc4/build-4.3.0-00/lib64/R/lib/libRblas.so 
    LAPACK: /varidata/research/software/BBC/R/hpc4/build-4.3.0-00/lib64/R/lib/libRlapack.so;  LAPACK version 3.11.0

    locale:
     [1] LC_CTYPE=C.utf8       LC_NUMERIC=C          LC_TIME=C.utf8        LC_COLLATE=C.utf8     LC_MONETARY=C.utf8   
     [6] LC_MESSAGES=C.utf8    LC_PAPER=C.utf8       LC_NAME=C             LC_ADDRESS=C          LC_TELEPHONE=C       
    [11] LC_MEASUREMENT=C.utf8 LC_IDENTIFICATION=C  

    time zone: America/Detroit
    tzcode source: system (glibc)

    attached base packages:
    [1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

    other attached packages:
     [1] DiffBind_3.10.1             SummarizedExperiment_1.30.2 Biobase_2.60.0              MatrixGenerics_1.12.3      
     [5] matrixStats_1.0.0           GenomicRanges_1.52.1        GenomeInfoDb_1.36.4         IRanges_2.34.1             
     [9] S4Vectors_0.38.2            BiocGenerics_0.46.0        

    loaded via a namespace (and not attached):
     [1] bitops_1.0-7             deldir_1.0-9             rlang_1.1.1              magrittr_2.0.3          
     [5] compiler_4.3.0           png_0.1-8                vctrs_0.6.4              stringr_1.5.1           
     [9] pkgconfig_2.0.3          crayon_1.5.2             fastmap_1.1.1            XVector_0.40.0          
    [13] caTools_1.18.2           utf8_1.2.3               Rsamtools_2.16.0         rmarkdown_2.25          
    [17] xfun_0.40                zlibbioc_1.46.0          DelayedArray_0.26.7      BiocParallel_1.34.2     
    [21] jpeg_0.1-10              irlba_2.3.5.1            parallel_4.3.0           R6_2.5.1                
    [25] stringi_1.7.12           RColorBrewer_1.1-3       SQUAREM_2021.1           limma_3.56.2            
    [29] rtracklayer_1.60.1       numDeriv_2016.8-1.1      Rcpp_1.0.11              knitr_1.44              
    [33] Matrix_1.6-4             tidyselect_1.2.0         rstudioapi_0.15.0        abind_1.4-5             
    [37] yaml_2.3.7               gplots_3.1.3             codetools_0.2-19         hwriter_1.3.2.1         
    [41] lattice_0.21-8           tibble_3.2.1             plyr_1.8.9               ShortRead_1.58.0        
    [45] coda_0.19-4              evaluate_0.22            Biostrings_2.68.1        pillar_1.9.0            
    [49] KernSmooth_2.23-20       generics_0.1.3           invgamma_1.1             RCurl_1.98-1.12         
    [53] truncnorm_1.0-9          emdbook_1.3.13           ggplot2_3.4.4            munsell_0.5.0           
    [57] scales_1.2.1             ashr_2.2-63              gtools_3.9.4             glue_1.6.2              
    [61] tools_4.3.0              apeglm_1.22.1            interp_1.1-4             BiocIO_1.10.0           
    [65] BSgenome_1.68.0          locfit_1.5-9.8           GenomicAlignments_1.36.0 systemPipeR_2.6.3       
    [69] mvtnorm_1.2-3            XML_3.99-0.14            grid_4.3.0               bbmle_1.0.25            
    [73] amap_0.8-19              bdsmatrix_1.3-6          latticeExtra_0.6-30      colorspace_2.1-0        
    [77] GenomeInfoDbData_1.2.10  restfulr_0.0.15          cli_3.6.1                GreyListChIP_1.32.1     
    [81] fansi_1.0.5              mixsqp_0.3-48            S4Arrays_1.0.6           dplyr_1.1.3             
    [85] gtable_0.3.4             DESeq2_1.40.2            digest_0.6.33            ggrepel_0.9.4           
    [89] rjson_0.2.21             htmlwidgets_1.6.2        htmltools_0.5.6.1        lifecycle_1.0.3         
    [93] MASS_7.3-58.4
ADD REPLY

Login before adding your answer.

Traffic: 548 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6