Hi there!
I am currently using DiffBind v3.2.7 to analyse some ChIP-seq data for RNA Polymerase III (RNAPIII). We have Drosophila spike-in chromatin in the samples that I would like to use in DiffBind to normalise the data. The problem is that during library prep, some of the samples accidentally got different proportions of the spike-in chromatin relative to sample chromatin.
My question is whether this can be accounted for in DiffBind. What I have tried is calculating a correction factor for each sample, defined as (% spike-in for that sample) / (minimum % spike-in across all samples). I then thought to multiply the values in dba$norm$DESeq2$norm.facs by these factors before running dba.analyze(). I multiply because I believe the norm.facs are used to divide the counts during analysis, so libraries with more spike-in get larger norm.facs and are therefore scaled down more. Please let me know if this makes sense and is OK to do, or if there is a better approach (such as subsampling the BAM files to achieve similar read counts before running DiffBind). Thanks!
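To make the calculation concrete, here is a minimal sketch of the arithmetic I have in mind. It uses plain named vectors as stand-ins rather than the real DiffBind object, and every number and name in it (spike_pct, corr, norm_facs, adj_facs) is made up for illustration. It only shows the proposed adjustment; whether that adjustment is valid is exactly what I am asking.

```r
# Hypothetical % spike-in per sample (illustrative values, not my real data)
spike_pct <- c(S1 = 2.0, S2 = 2.0, S3 = 4.0, S4 = 3.0)

# Correction factor: % spike-in for the sample / minimum % spike-in overall
corr <- spike_pct / min(spike_pct)

# Stand-in for the existing normalisation factors; in the real analysis
# these would be the values stored in dba$norm$DESeq2$norm.facs
norm_facs <- c(S1 = 1.0, S2 = 0.8, S3 = 1.2, S4 = 1.0)

# Proposed adjustment: multiply each factor by its correction factor,
# matching by sample name so the ordering cannot get mixed up
adj_facs <- norm_facs * corr[names(norm_facs)]

# Assuming the counts are divided by these factors, samples with more
# spike-in end up scaled down more
raw_count <- 1000
scaled <- raw_count / adj_facs

print(rbind(corr, norm_facs, adj_facs, scaled))
```

In the real workflow the adjusted values would be written back into dba$norm$DESeq2$norm.facs before dba.analyze(), which assumes the factors are stored in the same sample order as the correction factors.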
sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: SUSE Linux Enterprise Server 12 SP4
Matrix products: default
BLAS: /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
[4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_1.0.7 DiffBind_3.2.7 profileplyr_1.8.1
[4] SummarizedExperiment_1.22.0 Biobase_2.52.0 GenomicRanges_1.44.0
[7] GenomeInfoDb_1.28.4 IRanges_2.26.0 S4Vectors_0.30.0
[10] MatrixGenerics_1.4.3 matrixStats_0.60.1 BiocGenerics_0.38.0