erorr in normalization with DiffBind: pv$binding[, colnum] <- pv$peaks[[i]]$RPKM: number of items to replace is not a multiple of replacement length
Entering edit mode
Last seen 6 months ago

Hi all,

I am trying to analyze ATACseq data from mouse samples. I have 3 groups, with two samples each. I aligned with bowtie to mm10 and called peaks using MACS2, all with the Encode ATACseq pipeline.

I now want to call differentially opened chromatin regions with DiffBind, but it gives me an error when I want to normalize. Also when I run dba.analyze, it gives me the same error. I don't understand what I am doing wrong?

I appreciate any input! best, Ben


dir <- "/staging/leuven/stg_00072/Ben/ATACseq"

## read in annotation table
dba <- read.csv(file.path(dir, "samples.csv"), sep = ";")

## read in samples
dba <- dba(sampleSheet = dba)

## count
dba <- dba.count(dba)

## add contrast
dba <- dba.contrast(dba, minMembers = 2)

## remove blacklisted regions
dba <- dba.blacklist(dba)

## normalize data
dba.norm <- dba.normalize(dba)

After running this code i get the following error:

Error in pv$binding[, colnum] <- pv$peaks[[i]]$RPKM: number of items to replace is not a multiple of replacement length

1. dba.normalize(dba)
2. pv.doResetScore(res)
3. pv.setScore(pv, score = DBA_SCORE_NORMALIZED, bLog = FALSE)
4. pv.doSetScore(pv, DBA_SCORE_RPKM)

It gives (offcourse) the same after just running dba.analyze:

Applying Blacklist/Greylists...

Genome detected: Mmusculus.UCSC.mm10

Applying blacklist...

Removed: 2392 of 151927 intervals.

Removed 2392 (of 151927) consensus peaks.

Normalize DESeq2 with defaults...

Normalize error: Error in pv$binding[, colnum] <- pv$peaks[[i]]$RPKM: number of items to replace is not a multiple of replacement length

Unable to normalize datset with DESeq2.


R version 4.0.3 (2020-10-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /vsc-hard-mounts/leuven-data/328/vsc32808/miniconda/envs/R_base_5/lib/

[1] C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] DiffBind_3.0.15             SummarizedExperiment_1.20.0
 [3] Biobase_2.50.0              MatrixGenerics_1.2.1       
 [5] matrixStats_0.58.0          GenomicRanges_1.42.0       
 [7] GenomeInfoDb_1.26.4         IRanges_2.24.1             
 [9] S4Vectors_0.28.1            BiocGenerics_0.36.0        

loaded via a namespace (and not attached):
  [1] uuid_0.1-4               backports_1.2.1          GOstats_2.56.0          
  [4] BiocFileCache_1.14.0     plyr_1.8.6               repr_1.1.3              
  [7] GSEABase_1.52.1          splines_4.0.3            BiocParallel_1.24.1     
 [10] ggplot2_3.3.3            amap_0.8-18              digest_0.6.27           
 [13] invgamma_1.1             htmltools_0.5.1.1        GO.db_3.12.1            
 [16] SQUAREM_2021.1           fansi_0.4.2              magrittr_2.0.1          
 [19] checkmate_2.0.0          memoise_2.0.0            BSgenome_1.58.0         
 [22] base64url_1.4            limma_3.46.0             Biostrings_2.58.0       
 [25] annotate_1.68.0          systemPipeR_1.24.3       askpass_1.1             
 [28] bdsmatrix_1.3-4          prettyunits_1.1.1        jpeg_0.1-8.1            
 [31] colorspace_2.0-1         blob_1.2.1               rappdirs_0.3.3          
 [34] apeglm_1.12.0            ggrepel_0.9.1            dplyr_1.0.5             
 [37] crayon_1.4.1             RCurl_1.98-1.3           jsonlite_1.7.2          
 [40] graph_1.68.0             genefilter_1.72.1        brew_1.0-6              
 [43] survival_3.2-11          VariantAnnotation_1.36.0 glue_1.4.2              
 [46] gtable_0.3.0             zlibbioc_1.36.0          XVector_0.30.0          
 [49] DelayedArray_0.16.3      V8_3.4.0                 Rgraphviz_2.34.0        
 [52] scales_1.1.1             pheatmap_1.0.12          mvtnorm_1.1-1           
 [55] DBI_1.1.1                edgeR_3.32.1             Rcpp_1.0.6              
 [58] xtable_1.8-4             progress_1.2.2           emdbook_1.3.12          
 [61] bit_4.0.4                rsvg_2.1                 truncnorm_1.0-8         
 [64] AnnotationForge_1.32.0   httr_1.4.2               gplots_3.1.1            
 [67] RColorBrewer_1.1-2       ellipsis_0.3.2           pkgconfig_2.0.3         
 [70] XML_3.99-0.6             dbplyr_2.1.1             locfit_1.5-9.4          
 [73] utf8_1.2.1               tidyselect_1.1.1         rlang_0.4.11            
 [76] AnnotationDbi_1.52.0     munsell_0.5.0            tools_4.0.3             
 [79] cachem_1.0.4             generics_0.1.0           RSQLite_2.2.5           
 [82] evaluate_0.14            stringr_1.4.0            fastmap_1.1.0           
 [85] yaml_2.2.1               bit64_4.0.5              caTools_1.18.2          
 [88] purrr_0.3.4              RBGL_1.66.0              xml2_1.3.2              
 [91] biomaRt_2.46.3           compiler_4.0.3           curl_4.3.1              
 [94] png_0.1-7                geneplotter_1.68.0       tibble_3.1.1            
 [97] stringi_1.5.3            GenomicFeatures_1.42.2   lattice_0.20-44         
[100] IRdisplay_1.0            Matrix_1.3-2             vctrs_0.3.8             
[103] pillar_1.6.0             lifecycle_1.0.0          irlba_2.3.3             
[106] data.table_1.14.0        bitops_1.0-7             rtracklayer_1.50.0      
[109] R6_2.5.0                 latticeExtra_0.6-29      hwriter_1.3.2           
[112] ShortRead_1.48.0         KernSmooth_2.23-18       MASS_7.3-54             
[115] gtools_3.8.2             assertthat_0.2.1         DESeq2_1.30.1           
[118] openssl_1.4.4            Category_2.56.0          rjson_0.2.20            
[121] withr_2.4.2              GenomicAlignments_1.26.0 batchtools_0.9.15       
[124] Rsamtools_2.6.0          GenomeInfoDbData_1.2.4   hms_1.0.0               
[127] grid_4.0.3               IRkernel_1.1.1           DOT_0.1                 
[130] coda_0.19-4              GreyListChIP_1.22.0      ashr_2.2-47             
[133] mixsqp_0.3-43            pbdZMQ_0.3-5             bbmle_1.0.23.1          
[136] numDeriv_2016.8-1.1      base64enc_0.1-3
DiffBind • 324 views
Entering edit mode

If you could give me access to your DBA object after calling dba.count() (eg download link) I can take a look at what is going on -- I'd definitely like to get to the bottom of this!

Entering edit mode

Sure! Can I send you an email/pm?

Entering edit mode

Email me at the DiffBind maintainer address.

Entering edit mode
Rory Stark ★ 4.2k
Last seen 8 weeks ago
CRUK, Cambridge, UK

I have confirmed that this is a bug. The fix will appear in the next release (Bioconductor 2.13/DiffBind_3.2.0) later this week as it is too late to check a fix into the current release.

There is a workaround you can try, which is to call dba.blacklist() prior to calling dba.count(), instead of afterwards. This is the preferred way of blacklisting, as you remove suspect peaks prior to forming a consensus peakset.

If you have a particular reason to run the blacklist after counting reads in the consensus, you can also try running dba() immediately after calling dba.blacklist():

dba <- dba.blacklist(dba)
dba <- dba(dba)
dba.norm <- dba.normalize(dba)

Also, It is not a good idea to call your object dba, the same name as a core DiffBind function, if you can avoid it!

Entering edit mode

Thanks, it works!

I did call the object in my script different, but for clarity here I changed it.

best, Ben


Login before adding your answer.

Traffic: 517 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6