erorr in normalization with DiffBind: pv$binding[, colnum] <- pv$peaks[[i]]$RPKM: number of items to replace is not a multiple of replacement length
1
0
Entering edit mode
@141efca5
Last seen 4.4 years ago
Belgium

Hi all,

I am trying to analyze ATACseq data from mouse samples. I have 3 groups, with two samples each. I aligned with bowtie to mm10 and called peaks using MACS2, all with the Encode ATACseq pipeline.

I now want to call differentially opened chromatin regions with DiffBind, but it gives me an error when I want to normalize. Also when I run dba.analyze, it gives me the same error. I don't understand what I am doing wrong?

I appreciate any input! best, Ben


library(DiffBind)

dir <- "/staging/leuven/stg_00072/Ben/ATACseq"
setwd(dir)

## read in annotation table
dba <- read.csv(file.path(dir, "samples.csv"), sep = ";")

## read in samples
dba <- dba(sampleSheet = dba)

## count
dba <- dba.count(dba)

## add contrast
dba <- dba.contrast(dba, minMembers = 2)

## remove blacklisted regions
dba <- dba.blacklist(dba)

## normalize data
dba.norm <- dba.normalize(dba)

After running this code i get the following error:

Error in pv$binding[, colnum] <- pv$peaks[[i]]$RPKM: number of items to replace is not a multiple of replacement length
Traceback:

1. dba.normalize(dba)
2. pv.doResetScore(res)
3. pv.setScore(pv, score = DBA_SCORE_NORMALIZED, bLog = FALSE)
4. pv.doSetScore(pv, DBA_SCORE_RPKM)

It gives (offcourse) the same after just running dba.analyze:

Applying Blacklist/Greylists...

Genome detected: Mmusculus.UCSC.mm10

Applying blacklist...

Removed: 2392 of 151927 intervals.

Removed 2392 (of 151927) consensus peaks.

Normalize DESeq2 with defaults...

Normalize error: Error in pv$binding[, colnum] <- pv$peaks[[i]]$RPKM: number of items to replace is not a multiple of replacement length


Unable to normalize datset with DESeq2.

sessionInfo():

R version 4.0.3 (2020-10-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /vsc-hard-mounts/leuven-data/328/vsc32808/miniconda/envs/R_base_5/lib/libopenblasp-r0.3.15.so

locale:
[1] C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] DiffBind_3.0.15             SummarizedExperiment_1.20.0
 [3] Biobase_2.50.0              MatrixGenerics_1.2.1       
 [5] matrixStats_0.58.0          GenomicRanges_1.42.0       
 [7] GenomeInfoDb_1.26.4         IRanges_2.24.1             
 [9] S4Vectors_0.28.1            BiocGenerics_0.36.0        

loaded via a namespace (and not attached):
  [1] uuid_0.1-4               backports_1.2.1          GOstats_2.56.0          
  [4] BiocFileCache_1.14.0     plyr_1.8.6               repr_1.1.3              
  [7] GSEABase_1.52.1          splines_4.0.3            BiocParallel_1.24.1     
 [10] ggplot2_3.3.3            amap_0.8-18              digest_0.6.27           
 [13] invgamma_1.1             htmltools_0.5.1.1        GO.db_3.12.1            
 [16] SQUAREM_2021.1           fansi_0.4.2              magrittr_2.0.1          
 [19] checkmate_2.0.0          memoise_2.0.0            BSgenome_1.58.0         
 [22] base64url_1.4            limma_3.46.0             Biostrings_2.58.0       
 [25] annotate_1.68.0          systemPipeR_1.24.3       askpass_1.1             
 [28] bdsmatrix_1.3-4          prettyunits_1.1.1        jpeg_0.1-8.1            
 [31] colorspace_2.0-1         blob_1.2.1               rappdirs_0.3.3          
 [34] apeglm_1.12.0            ggrepel_0.9.1            dplyr_1.0.5             
 [37] crayon_1.4.1             RCurl_1.98-1.3           jsonlite_1.7.2          
 [40] graph_1.68.0             genefilter_1.72.1        brew_1.0-6              
 [43] survival_3.2-11          VariantAnnotation_1.36.0 glue_1.4.2              
 [46] gtable_0.3.0             zlibbioc_1.36.0          XVector_0.30.0          
 [49] DelayedArray_0.16.3      V8_3.4.0                 Rgraphviz_2.34.0        
 [52] scales_1.1.1             pheatmap_1.0.12          mvtnorm_1.1-1           
 [55] DBI_1.1.1                edgeR_3.32.1             Rcpp_1.0.6              
 [58] xtable_1.8-4             progress_1.2.2           emdbook_1.3.12          
 [61] bit_4.0.4                rsvg_2.1                 truncnorm_1.0-8         
 [64] AnnotationForge_1.32.0   httr_1.4.2               gplots_3.1.1            
 [67] RColorBrewer_1.1-2       ellipsis_0.3.2           pkgconfig_2.0.3         
 [70] XML_3.99-0.6             dbplyr_2.1.1             locfit_1.5-9.4          
 [73] utf8_1.2.1               tidyselect_1.1.1         rlang_0.4.11            
 [76] AnnotationDbi_1.52.0     munsell_0.5.0            tools_4.0.3             
 [79] cachem_1.0.4             generics_0.1.0           RSQLite_2.2.5           
 [82] evaluate_0.14            stringr_1.4.0            fastmap_1.1.0           
 [85] yaml_2.2.1               bit64_4.0.5              caTools_1.18.2          
 [88] purrr_0.3.4              RBGL_1.66.0              xml2_1.3.2              
 [91] biomaRt_2.46.3           compiler_4.0.3           curl_4.3.1              
 [94] png_0.1-7                geneplotter_1.68.0       tibble_3.1.1            
 [97] stringi_1.5.3            GenomicFeatures_1.42.2   lattice_0.20-44         
[100] IRdisplay_1.0            Matrix_1.3-2             vctrs_0.3.8             
[103] pillar_1.6.0             lifecycle_1.0.0          irlba_2.3.3             
[106] data.table_1.14.0        bitops_1.0-7             rtracklayer_1.50.0      
[109] R6_2.5.0                 latticeExtra_0.6-29      hwriter_1.3.2           
[112] ShortRead_1.48.0         KernSmooth_2.23-18       MASS_7.3-54             
[115] gtools_3.8.2             assertthat_0.2.1         DESeq2_1.30.1           
[118] openssl_1.4.4            Category_2.56.0          rjson_0.2.20            
[121] withr_2.4.2              GenomicAlignments_1.26.0 batchtools_0.9.15       
[124] Rsamtools_2.6.0          GenomeInfoDbData_1.2.4   hms_1.0.0               
[127] grid_4.0.3               IRkernel_1.1.1           DOT_0.1                 
[130] coda_0.19-4              GreyListChIP_1.22.0      ashr_2.2-47             
[133] mixsqp_0.3-43            pbdZMQ_0.3-5             bbmle_1.0.23.1          
[136] numDeriv_2016.8-1.1      base64enc_0.1-3
DiffBind • 1.5k views
ADD COMMENT
0
Entering edit mode

If you could give me access to your DBA object after calling dba.count() (eg download link) I can take a look at what is going on -- I'd definitely like to get to the bottom of this!

ADD REPLY
0
Entering edit mode

Sure! Can I send you an email/pm?

ADD REPLY
0
Entering edit mode

Email me at the DiffBind maintainer address.

ADD REPLY
2
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 9 months ago
Cambridge, UK

I have confirmed that this is a bug. The fix will appear in the next release (Bioconductor 2.13/DiffBind_3.2.0) later this week as it is too late to check a fix into the current release.

There is a workaround you can try, which is to call dba.blacklist() prior to calling dba.count(), instead of afterwards. This is the preferred way of blacklisting, as you remove suspect peaks prior to forming a consensus peakset.

If you have a particular reason to run the blacklist after counting reads in the consensus, you can also try running dba() immediately after calling dba.blacklist():

dba <- dba.blacklist(dba)
dba <- dba(dba)
dba.norm <- dba.normalize(dba)

Also, It is not a good idea to call your object dba, the same name as a core DiffBind function, if you can avoid it!

ADD COMMENT
0
Entering edit mode

Thanks, it works!

I did call the object in my script different, but for clarity here I changed it.

best, Ben

ADD REPLY

Login before adding your answer.

Traffic: 621 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6