Hi there, may I ask how i can obtain the normalization factors for each sample for this offsets and Loess normalization? I would like to scale my bigwig file according to the normalisation factors. However, I can't seem to find the $norm.facs
object within tumor_normal_normalized_loess_blacklisted
, which can be found in RLE or TMM normalization.
I would like to scale my bigwig files using the scaleFactor
argument of bamCoverage
from deepTools
. Also, should I scale my bigwig files according to their respective normalisation factors exactly, or the inverse of the normalization factors generated from diffBind?
Thank you!
================
Codes:
library(DiffBind)
library(edgeR)
library(tidyverse)
library(dplyr)
samples <- read.csv('Sample.csv')
# create a dba object which contains the peaksets of the samples
dbObj <-dba(sampleSheet = samples)
dbObj
# ========= Obtain consensus peaks ===========
# Get the peakset for tumor and normal
tumor_normal_peaks <- dba.peakset(dbObj,
consensus=DBA_FACTOR,
minOverlap = 2)
tumor_normal_overlap <- dba(tumor_normal_peaks,
mask=tumor_normal_peaks$masks$Consensus,
minOverlap = 2)
consensus_peaks <- dba.peakset(tumor_normal_overlap,
bRetrieve = TRUE)
consensus_peaks
# ============ Affinity binding analysis ================
# Extract the tumor-normal overlap peaks from the peaksets of each individual sample
tumor_normal_count_loess <- dba.count(dbObj,
peaks=consensus_peaks, #specify the peak source. Peaks retrieved from tumor_normal_overlap
bUseSummarizeOverlaps = TRUE)
# Remove unwanted blacklists
tumor_normal_count_loess_blacklisted <-dba.blacklist(tumor_normal_count_loess, blacklist = DBA_BLACKLIST_HG38, greylist = FALSE)
# ============ Loess fit Normalisation ==================
tumor_normal_count_loess_blacklisted$config$AnalysisMethod <- DBA_EDGER
tumor_normal_count_loess_blacklisted$config$AnalysisMethod
tumor_normal_normalized_loess_blacklisted <- dba.normalize(tumor_normal_count_loess_blacklisted,
offsets = TRUE)
norm <- dba.normalize(tumor_normal_normalized_loess_blacklisted,
bRetrieve = TRUE)
> norm
$norm.method
[1] "adjust offsets"
$lib.method
[1] "RiP"
$lib.sizes
T5B_1 T5B_3 T5B_4 T13A_1 T13A_2 N1_2 N1_L3
1958034 4842208 3609605 619744 4863412 806633 566692
276_N 276N_L1
711525 336097
$filter.value
[1] 1
> sessionInfo()
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.1.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] lubridate_1.9.3 forcats_1.0.0
[3] stringr_1.5.1 dplyr_1.1.4
[5] purrr_1.0.2 readr_2.1.5
[7] tidyr_1.3.1 tibble_3.2.1
[9] ggplot2_3.5.0 tidyverse_2.0.0
[11] edgeR_4.0.16 limma_3.58.1
[13] DiffBind_3.12.0 SummarizedExperiment_1.32.0
[15] Biobase_2.62.0 MatrixGenerics_1.14.0
[17] matrixStats_1.2.0 GenomicRanges_1.54.1
[19] GenomeInfoDb_1.38.8 IRanges_2.36.0
[21] S4Vectors_0.40.2 BiocGenerics_0.48.1
loaded via a namespace (and not attached):
[1] bitops_1.0-7 deldir_2.0-4
[3] rlang_1.1.3 magrittr_2.0.3
[5] compiler_4.3.3 png_0.1-8
[7] vctrs_0.6.5 pkgconfig_2.0.3
[9] crayon_1.5.2 fastmap_1.1.1
[11] XVector_0.42.0 caTools_1.18.2
[13] utf8_1.2.4 Rsamtools_2.18.0
[15] tzdb_0.4.0 zlibbioc_1.48.2
[17] DelayedArray_0.28.0 BiocParallel_1.36.0
[19] jpeg_0.1-10 irlba_2.3.5.1
[21] parallel_4.3.3 R6_2.5.1
[23] stringi_1.8.3 RColorBrewer_1.1-3
[25] SQUAREM_2021.1 rtracklayer_1.62.0
[27] numDeriv_2016.8-1.1 Rcpp_1.0.12
[29] timechange_0.3.0 Matrix_1.6-5
[31] tidyselect_1.2.1 rstudioapi_0.16.0
[33] abind_1.4-5 yaml_2.3.8
[35] gplots_3.1.3.1 codetools_0.2-20
[37] hwriter_1.3.2.1 lattice_0.22-6
[39] plyr_1.8.9 withr_3.0.0
[41] ShortRead_1.60.0 coda_0.19-4.1
[43] Biostrings_2.70.3 pillar_1.9.0
[45] KernSmooth_2.23-22 generics_0.1.3
[47] invgamma_1.1 RCurl_1.98-1.14
[49] truncnorm_1.0-9 emdbook_1.3.13
[51] hms_1.1.3 munsell_0.5.1
[53] scales_1.3.0 ashr_2.2-63
[55] gtools_3.9.5 glue_1.7.0
[57] tools_4.3.3 apeglm_1.24.0
[59] interp_1.1-6 BiocIO_1.12.0
[61] BSgenome_1.70.2 locfit_1.5-9.9
[63] GenomicAlignments_1.38.2 systemPipeR_2.8.0
[65] mvtnorm_1.2-4 XML_3.99-0.16.1
[67] grid_4.3.3 bbmle_1.0.25.1
[69] amap_0.8-19 bdsmatrix_1.3-7
[71] latticeExtra_0.6-30 colorspace_2.1-0
[73] GenomeInfoDbData_1.2.11 restfulr_0.0.15
[75] cli_3.6.2 GreyListChIP_1.34.0
[77] fansi_1.0.6 mixsqp_0.3-54
[79] S4Arrays_1.2.1 gtable_0.3.4
[81] DESeq2_1.42.1 digest_0.6.35
[83] SparseArray_1.2.4 ggrepel_0.9.5
[85] rjson_0.2.21 htmlwidgets_1.6.4
[87] htmltools_0.5.8.1 lifecycle_1.0.4
[89] statmod_1.5.0 MASS_7.3-60.0.1
When you use offsets then there is not a single size factor but an offset matrix with one value per peak and sample. Not sure what you could do here other than running a separate normalization without offsets to get a size factor. I am not aware of a method to scale a bigwig with an offset matrix simply because offsets are per peak and bigwigs are per base/window.
I see, thank you for your suggestion! I managed to use a separate RLE normalization to obtain the size factors.