[DiffBind] Obtaining normalization factors
0
1
Entering edit mode
Henry ▴ 10
@40e2dbef
Last seen 5 months ago
United Kingdom

Hi there, may I ask how i can obtain the normalization factors for each sample for this offsets and Loess normalization? I would like to scale my bigwig file according to the normalisation factors. However, I can't seem to find the $norm.facs object within tumor_normal_normalized_loess_blacklisted, which can be found in RLE or TMM normalization.

I would like to scale my bigwig files using the scaleFactor argument of bamCoverage from deepTools. Also, should I scale my bigwig files according to their respective normalisation factors exactly, or the inverse of the normalization factors generated from diffBind? Thank you!

================

Codes:

library(DiffBind)
library(edgeR)
library(tidyverse)
library(dplyr)

samples <- read.csv('Sample.csv')

# create a dba object which contains the peaksets of the samples
dbObj <-dba(sampleSheet = samples)
dbObj

# ========= Obtain consensus peaks ===========
# Get the peakset for tumor and normal
tumor_normal_peaks <- dba.peakset(dbObj, 
                                  consensus=DBA_FACTOR, 
                                  minOverlap = 2) 

tumor_normal_overlap <- dba(tumor_normal_peaks,
                            mask=tumor_normal_peaks$masks$Consensus,
                            minOverlap = 2)

consensus_peaks <- dba.peakset(tumor_normal_overlap, 
                               bRetrieve = TRUE) 

consensus_peaks

# ============ Affinity binding analysis ================
# Extract the tumor-normal overlap peaks from the peaksets of each individual sample

tumor_normal_count_loess <- dba.count(dbObj, 
                                      peaks=consensus_peaks, #specify the peak source. Peaks retrieved from tumor_normal_overlap
                                      bUseSummarizeOverlaps = TRUE)

# Remove unwanted blacklists
tumor_normal_count_loess_blacklisted <-dba.blacklist(tumor_normal_count_loess, blacklist = DBA_BLACKLIST_HG38, greylist = FALSE)

# ============ Loess fit Normalisation ==================

tumor_normal_count_loess_blacklisted$config$AnalysisMethod <- DBA_EDGER
tumor_normal_count_loess_blacklisted$config$AnalysisMethod

tumor_normal_normalized_loess_blacklisted <- dba.normalize(tumor_normal_count_loess_blacklisted,
                                               offsets = TRUE)

norm <- dba.normalize(tumor_normal_normalized_loess_blacklisted, 
                      bRetrieve  = TRUE)

> norm
$norm.method
[1] "adjust offsets"

$lib.method
[1] "RiP"

$lib.sizes
T5B_1 T5B_3 T5B_4 T13A_1 T13A_2  N1_2 N1_L3 
     1958034      4842208      3609605       619744      4863412       806633       566692 
    276_N   276N_L1 
      711525       336097 

$filter.value
[1] 1
> sessionInfo()

Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.1.2

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] lubridate_1.9.3             forcats_1.0.0              
 [3] stringr_1.5.1               dplyr_1.1.4                
 [5] purrr_1.0.2                 readr_2.1.5                
 [7] tidyr_1.3.1                 tibble_3.2.1               
 [9] ggplot2_3.5.0               tidyverse_2.0.0            
[11] edgeR_4.0.16                limma_3.58.1               
[13] DiffBind_3.12.0             SummarizedExperiment_1.32.0
[15] Biobase_2.62.0              MatrixGenerics_1.14.0      
[17] matrixStats_1.2.0           GenomicRanges_1.54.1       
[19] GenomeInfoDb_1.38.8         IRanges_2.36.0             
[21] S4Vectors_0.40.2            BiocGenerics_0.48.1        

loaded via a namespace (and not attached):
 [1] bitops_1.0-7             deldir_2.0-4            
 [3] rlang_1.1.3              magrittr_2.0.3          
 [5] compiler_4.3.3           png_0.1-8               
 [7] vctrs_0.6.5              pkgconfig_2.0.3         
 [9] crayon_1.5.2             fastmap_1.1.1           
[11] XVector_0.42.0           caTools_1.18.2          
[13] utf8_1.2.4               Rsamtools_2.18.0        
[15] tzdb_0.4.0               zlibbioc_1.48.2         
[17] DelayedArray_0.28.0      BiocParallel_1.36.0     
[19] jpeg_0.1-10              irlba_2.3.5.1           
[21] parallel_4.3.3           R6_2.5.1                
[23] stringi_1.8.3            RColorBrewer_1.1-3      
[25] SQUAREM_2021.1           rtracklayer_1.62.0      
[27] numDeriv_2016.8-1.1      Rcpp_1.0.12             
[29] timechange_0.3.0         Matrix_1.6-5            
[31] tidyselect_1.2.1         rstudioapi_0.16.0       
[33] abind_1.4-5              yaml_2.3.8              
[35] gplots_3.1.3.1           codetools_0.2-20        
[37] hwriter_1.3.2.1          lattice_0.22-6          
[39] plyr_1.8.9               withr_3.0.0             
[41] ShortRead_1.60.0         coda_0.19-4.1           
[43] Biostrings_2.70.3        pillar_1.9.0            
[45] KernSmooth_2.23-22       generics_0.1.3          
[47] invgamma_1.1             RCurl_1.98-1.14         
[49] truncnorm_1.0-9          emdbook_1.3.13          
[51] hms_1.1.3                munsell_0.5.1           
[53] scales_1.3.0             ashr_2.2-63             
[55] gtools_3.9.5             glue_1.7.0              
[57] tools_4.3.3              apeglm_1.24.0           
[59] interp_1.1-6             BiocIO_1.12.0           
[61] BSgenome_1.70.2          locfit_1.5-9.9          
[63] GenomicAlignments_1.38.2 systemPipeR_2.8.0       
[65] mvtnorm_1.2-4            XML_3.99-0.16.1         
[67] grid_4.3.3               bbmle_1.0.25.1          
[69] amap_0.8-19              bdsmatrix_1.3-7         
[71] latticeExtra_0.6-30      colorspace_2.1-0        
[73] GenomeInfoDbData_1.2.11  restfulr_0.0.15         
[75] cli_3.6.2                GreyListChIP_1.34.0     
[77] fansi_1.0.6              mixsqp_0.3-54           
[79] S4Arrays_1.2.1           gtable_0.3.4            
[81] DESeq2_1.42.1            digest_0.6.35           
[83] SparseArray_1.2.4        ggrepel_0.9.5           
[85] rjson_0.2.21             htmlwidgets_1.6.4       
[87] htmltools_0.5.8.1        lifecycle_1.0.4         
[89] statmod_1.5.0            MASS_7.3-60.0.1
DiffBind • 542 views
ADD COMMENT
1
Entering edit mode

When you use offsets then there is not a single size factor but an offset matrix with one value per peak and sample. Not sure what you could do here other than running a separate normalization without offsets to get a size factor. I am not aware of a method to scale a bigwig with an offset matrix simply because offsets are per peak and bigwigs are per base/window.

ADD REPLY
0
Entering edit mode

I see, thank you for your suggestion! I managed to use a separate RLE normalization to obtain the size factors.

ADD REPLY

Login before adding your answer.

Traffic: 578 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6