DiffBind GreyListChIP error: Error: BiocParallel errors
kwangbom ▴ 10
@d1e3382c
Last seen 10 months ago
United States

I'd like to first thank the developers for a fine set of tools. I am performing a two-group comparison in which each group contains 3-4 treated/input pairs. I am running the analysis on AWS EC2, where I have installed r-base and DiffBind in a conda environment. I am currently trying to resolve the following issue from dba.analyze:

> h3k27ac <- dba.analyze(h3k27ac)
Applying Blacklist/Greylists...
Genome detected: Hsapiens.NCBI.GRCh38
Applying blacklist...
Removed: 5 of 58439 intervals.
Counting control reads for greylist...
Blacklist error: Error in value[[3L]](cond): GreyListChIP error: Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: 'seqlengths' contains NAs or negative values

Applying the blacklist works fine, but the greylist step fails with BiocParallel errors. From googling, I found the following suggestion, but it did not help at all:

> BiocParallel::register(BiocParallel::SerialParam())

I know dba.analyze(h3k27ac, bGreylist=FALSE) works but, given my input data, greylisting should have no issue from my perspective. I would appreciate any help or insight. Here's my sessionInfo FYI.

> sessionInfo( )
R version 4.1.0 (2021-05-18)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Amazon Linux 2

Matrix products: default
BLAS/LAPACK: /home/ec2-user/miniconda3/envs/R/lib/libopenblasp-r0.3.15.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] BiocParallel_1.26.1         DiffBind_3.2.4
 [3] SummarizedExperiment_1.22.0 Biobase_2.52.0
 [5] MatrixGenerics_1.4.0        matrixStats_0.59.0
 [7] GenomicRanges_1.44.0        GenomeInfoDb_1.28.1
 [9] IRanges_2.26.0              S4Vectors_0.30.0
[11] BiocGenerics_0.38.0

loaded via a namespace (and not attached):
  [1] backports_1.2.1          GOstats_2.58.0           BiocFileCache_2.0.0
  [4] plyr_1.8.6               GSEABase_1.54.0          splines_4.1.0
  [7] ggplot2_3.3.5            amap_0.8-18              digest_0.6.27
 [10] invgamma_1.1             GO.db_3.13.0             SQUAREM_2021.1
 [13] fansi_0.5.0              magrittr_2.0.1           checkmate_2.0.0
 [16] memoise_2.0.0            BSgenome_1.60.0          base64url_1.4
 [19] limma_3.48.1             Biostrings_2.60.1        annotate_1.70.0
 [22] systemPipeR_1.26.3       bdsmatrix_1.3-4          prettyunits_1.1.1
 [25] jpeg_0.1-8.1             colorspace_2.0-2         blob_1.2.1
 [28] rappdirs_0.3.3           apeglm_1.14.0            ggrepel_0.9.1
 [31] dplyr_1.0.7              crayon_1.4.1             RCurl_1.98-1.3
 [34] jsonlite_1.7.2           graph_1.70.0             genefilter_1.74.0
 [37] brew_1.0-6               survival_3.2-11          VariantAnnotation_1.38.0
 [40] glue_1.4.2               gtable_0.3.0             zlibbioc_1.38.0
 [43] XVector_0.32.0           DelayedArray_0.18.0      V8_3.4.2
 [46] Rgraphviz_2.36.0         scales_1.1.1             pheatmap_1.0.12
 [49] mvtnorm_1.1-2            DBI_1.1.1                edgeR_3.34.0
 [52] Rcpp_1.0.7               xtable_1.8-4             progress_1.2.2
 [55] emdbook_1.3.12           bit_4.0.4                rsvg_2.1.2
 [58] AnnotationForge_1.34.0   truncnorm_1.0-8          httr_1.4.2
 [61] gplots_3.1.1             RColorBrewer_1.1-2       ellipsis_0.3.2
 [64] pkgconfig_2.0.3          XML_3.99-0.6             dbplyr_2.1.1
 [67] locfit_1.5-9.4           utf8_1.2.1               tidyselect_1.1.1
 [70] rlang_0.4.11             AnnotationDbi_1.54.1     munsell_0.5.0
 [73] tools_4.1.0              cachem_1.0.5             generics_0.1.0
 [76] RSQLite_2.2.5            stringr_1.4.0            fastmap_1.1.0
 [79] yaml_2.2.1               bit64_4.0.5              caTools_1.18.2
 [82] purrr_0.3.4              KEGGREST_1.32.0          RBGL_1.68.0
 [85] xml2_1.3.2               biomaRt_2.48.2           compiler_4.1.0
 [88] rstudioapi_0.13          filelock_1.0.2           curl_4.3.2
 [91] png_0.1-7                geneplotter_1.70.0       tibble_3.1.2
 [94] stringi_1.7.3            GenomicFeatures_1.44.0   lattice_0.20-44
 [97] Matrix_1.3-4             vctrs_0.3.8              pillar_1.6.1
[100] lifecycle_1.0.0          irlba_2.3.3              data.table_1.14.0
[103] bitops_1.0-7             rtracklayer_1.52.0       R6_2.5.0
[106] BiocIO_1.2.0             latticeExtra_0.6-29      hwriter_1.3.2
[109] ShortRead_1.50.0         KernSmooth_2.23-20       MASS_7.3-54
[112] gtools_3.9.2             assertthat_0.2.1         DESeq2_1.32.0
[115] Category_2.58.0          rjson_0.2.20             withr_2.4.2
[118] GenomicAlignments_1.28.0 batchtools_0.9.15        Rsamtools_2.8.0
[121] GenomeInfoDbData_1.2.6   hms_1.1.0                grid_4.1.0
[124] DOT_0.1                  coda_0.19-4              GreyListChIP_1.24.0
[127] ashr_2.2-47              mixsqp_0.3-43            bbmle_1.0.23.1
[130] numDeriv_2016.8-1.1      restfulr_0.0.13
DiffBind • 1.2k views
Rory Stark ★ 4.4k
@rory-stark-5741
Last seen 11 days ago
CRUK, Cambridge, UK

This error is coming from the GreyListChIP package. I have just taken ownership of that package but am still learning the code so please bear with me.

I assume you are running dba.analyze() after running dba.count()? If so, one thing to try is to run dba.blacklist() before running dba.count():

h3k27ac <- dba(sampleSheet="mysamples.csv")
h3k27ac <- dba.blacklist(h3k27ac)
h3k27ac <- dba.count(h3k27ac)
h3k27ac <- dba.analyze(h3k27ac)

Does this change things? It really should work either before or after calling dba.count(), so even if calling it first fixes things, I'd still like to get to the bottom of your issue. I see that there is some issue with bad seqlengths. I suspect some mismatch between the [...]. If we can narrow this down to one control .bam file, I should be able to debug it, provided you can give me access to that bam file and a copy of your h3k27ac object.

You are on the right track in trying to run it serially to see the error messages; however, the best way to do this is to specify cores=1 in the call to dba.blacklist():

 h3k27ac <- dba.blacklist(h3k27ac, cores=1)
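One way to narrow this down to a single control file is to compare the contig names and lengths in each BAM header with those of the reference genome. The following is a minimal sketch, not part of DiffBind or GreyListChIP: `compare_seqlengths()` is a hypothetical helper, the toy vectors are illustrative, and `control.bam` is a placeholder path. It assumes the 'seqlengths' error comes from contigs in the BAM header that the detected genome does not contain (for example "1" versus "chr1"), which the thread does not confirm.

```r
# Compare contig names/lengths from a BAM header with those of a reference.
# Both inputs are named integer vectors (names = contig names).
compare_seqlengths <- function(bam_lengths, genome_lengths) {
  contigs <- names(bam_lengths)
  bam <- unname(bam_lengths)
  ref <- unname(genome_lengths[contigs])  # NA where the genome lacks the contig
  status <- ifelse(is.na(ref), "missing_in_genome",
                   ifelse(ref == bam, "ok", "length_differs"))
  data.frame(contig = contigs, bam_length = bam, genome_length = ref,
             status = status, stringsAsFactors = FALSE)
}

# Toy input (illustrative, not taken from the thread): Ensembl-style names in
# the BAM header versus UCSC-style names in the genome.
toy_bam    <- c("1" = 248956422L, "2" = 242193529L, "MT" = 16569L)
toy_genome <- c(chr1 = 248956422L, chr2 = 242193529L, chrM = 16569L)
toy_report <- compare_seqlengths(toy_bam, toy_genome)
print(toy_report)

# Real data: only runs if the packages and the (hypothetical) file are present.
bam_path <- "control.bam"  # hypothetical path to one control BAM
if (file.exists(bam_path) &&
    requireNamespace("Rsamtools", quietly = TRUE) &&
    requireNamespace("BSgenome.Hsapiens.UCSC.hg38", quietly = TRUE)) {
  bam_lengths <- Rsamtools::scanBamHeader(bam_path)[[1]]$targets
  genome <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
  report <- compare_seqlengths(bam_lengths, GenomeInfoDb::seqlengths(genome))
  print(report[report$status != "ok", ])
}
```

Any row flagged as `missing_in_genome` or `length_differs` points at a contig worth checking against the genome that DiffBind reports in its "Genome detected:" line.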

Hi Rory,

I am having this exact same problem and wonder if you have any other suggestions. I get the same error whether I run the blacklist before or after counting. I have included my error and sessionInfo below, and am happy to provide more. My BAM files are from an alignment to GRCh37, so I thought this might be the issue, except that the blacklist functionality works fine with hg19.

dbObj <- dba.blacklist(dbObj, blacklist=FALSE, greylist="BSgenome.Hsapiens.UCSC.hg19", cores=1)

    Genome detected: Hsapiens.UCSC.hg19
    Counting control reads for greylist...
    Error in value[[3L]](cond) : 
      GreyListChIP error: Error: BiocParallel errors
      element index: 1, 2, 3, 4, 5, 6
      first error: 'seqlengths' contains NAs or negative values



> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /n/app/openblas/0.2.19/lib/libopenblas_core2p-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] DiffBind_3.2.5              SummarizedExperiment_1.22.0
 [3] Biobase_2.52.0              MatrixGenerics_1.4.2       
 [5] matrixStats_0.60.1          GenomicRanges_1.44.0       
 [7] GenomeInfoDb_1.28.1         IRanges_2.26.0             
 [9] S4Vectors_0.30.0            BiocGenerics_0.38.0        

loaded via a namespace (and not attached):
  [1] backports_1.2.1          GOstats_2.58.0           BiocFileCache_2.0.0     
  [4] plyr_1.8.6               GSEABase_1.54.0          splines_4.1.1           
  [7] BiocParallel_1.26.2      ggplot2_3.3.5            amap_0.8-18             
 [10] digest_0.6.27            invgamma_1.1             GO.db_3.13.0            
 [13] SQUAREM_2021.1           fansi_0.5.0              magrittr_2.0.1          
 [16] checkmate_2.0.0          memoise_2.0.0            BSgenome_1.60.0         
 [19] base64url_1.4            limma_3.48.3             Biostrings_2.60.2       
 [22] annotate_1.70.0          systemPipeR_1.26.3       bdsmatrix_1.3-4         
 [25] prettyunits_1.1.1        jpeg_0.1-9               colorspace_2.0-2        
 [28] blob_1.2.2               rappdirs_0.3.3           apeglm_1.14.0           
 [31] ggrepel_0.9.1            dplyr_1.0.7              crayon_1.4.1            
 [34] RCurl_1.98-1.4           jsonlite_1.7.2           graph_1.70.0            
 [37] genefilter_1.74.0        brew_1.0-6               survival_3.2-11         
 [40] VariantAnnotation_1.38.0 glue_1.4.2               gtable_0.3.0            
 [43] zlibbioc_1.38.0          XVector_0.32.0           DelayedArray_0.18.0     
 [46] V8_3.4.2                 Rgraphviz_2.36.0         scales_1.1.1            
 [49] pheatmap_1.0.12          mvtnorm_1.1-2            DBI_1.1.1               
 [52] edgeR_3.34.0             Rcpp_1.0.7               xtable_1.8-4            
 [55] progress_1.2.2           emdbook_1.3.12           bit_4.0.4               
 [58] rsvg_2.1.2               AnnotationForge_1.34.0   truncnorm_1.0-8         
 [61] httr_1.4.2               gplots_3.1.1             RColorBrewer_1.1-2      
 [64] ellipsis_0.3.2           pkgconfig_2.0.3          XML_3.99-0.7            
 [67] dbplyr_2.1.1             locfit_1.5-9.4           utf8_1.2.2              
 [70] tidyselect_1.1.1         rlang_0.4.11             AnnotationDbi_1.54.1    
 [73] munsell_0.5.0            tools_4.1.1              cachem_1.0.6            
 [76] generics_0.1.0           RSQLite_2.2.8            stringr_1.4.0           
 [79] fastmap_1.1.0            yaml_2.2.1               bit64_4.0.5             
 [82] caTools_1.18.2           purrr_0.3.4              KEGGREST_1.32.0         
 [85] RBGL_1.68.0              xml2_1.3.2               biomaRt_2.48.3          
 [88] compiler_4.1.1           rstudioapi_0.13          filelock_1.0.2          
 [91] curl_4.3.2               png_0.1-7                tibble_3.1.4            
 [94] stringi_1.7.4            GenomicFeatures_1.44.1   lattice_0.20-44         
 [97] Matrix_1.3-4             vctrs_0.3.8              pillar_1.6.2            
[100] lifecycle_1.0.0          irlba_2.3.3              data.table_1.14.0       
[103] bitops_1.0-7             rtracklayer_1.52.1       R6_2.5.1                
[106] BiocIO_1.2.0             latticeExtra_0.6-29      hwriter_1.3.2           
[109] ShortRead_1.50.0         KernSmooth_2.23-20       MASS_7.3-54             
[112] gtools_3.9.2             assertthat_0.2.1         Category_2.58.0         
[115] rjson_0.2.20             withr_2.4.2              GenomicAlignments_1.28.0
[118] batchtools_0.9.15        Rsamtools_2.8.0          GenomeInfoDbData_1.2.6  
[121] hms_1.1.0                grid_4.1.1               DOT_0.1                 
[124] coda_0.19-4              GreyListChIP_1.24.0      ashr_2.2-47             
[127] mixsqp_0.3-43            bbmle_1.0.24             numDeriv_2016.8-1.1     
[130] restfulr_0.0.13   

Hello my friends, I have exactly the same issue. May I ask if you have already solved this problem? I have tried library(BiocParallel) followed by register(SerialParam()), but it did not work for me.

Hi Rory, let me know if you think this is a problem with the control bam files. I can show you the bam files if you have time. Thank you so much!

h3k27ac <- dba(sampleSheet=read.csv("h3k27ac.csv"))
h3k27ac <- dba.blacklist(h3k27ac)

Genome detected: Hsapiens.UCSC.hg38
Applying blacklist...
Removed: 268 of 119013 intervals.
Counting control reads for greylist...
Error in value[[3L]](cond) :
GreyListChIP error: Error: BiocParallel errors. element index: 1, 2, 3, 4, 5, 6, ...
first error: 'seqlengths' contains NAs or negative values


Could you let me know the versions? Output of sessionInfo().


Hi Rory,
Thank you for your reply! Here is my sessionInfo():
Please let me know what you think. I really appreciate your help!

sessionInfo()

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] BiocParallel_1.26.2 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.9
[5] purrr_0.3.4 readr_2.1.2 tidyr_1.2.0 tibble_3.1.7
[9] ggplot2_3.3.6 tidyverse_1.3.1 DiffBind_3.2.7 SummarizedExperiment_1.22.0
[13] Biobase_2.52.0 MatrixGenerics_1.4.3 matrixStats_0.62.0 GenomicRanges_1.44.0
[17] GenomeInfoDb_1.28.4 IRanges_2.26.0 S4Vectors_0.30.2 BiocGenerics_0.38.0

loaded via a namespace (and not attached):
[1] readxl_1.4.0 backports_1.4.1 GOstats_2.58.0 BiocFileCache_2.0.0 plyr_1.8.7
[6] GSEABase_1.54.0 splines_4.1.2 amap_0.8-18 digest_0.6.29 invgamma_1.1
[11] GO.db_3.13.0 SQUAREM_2021.1 fansi_1.0.3 magrittr_2.0.3 checkmate_2.1.0
[16] memoise_2.0.1 BSgenome_1.60.0 base64url_1.4 tzdb_0.3.0 limma_3.48.3
[21] Biostrings_2.60.2 annotate_1.70.0 modelr_0.1.8 systemPipeR_1.26.3 bdsmatrix_1.3-4
[26] prettyunits_1.1.1 jpeg_0.1-9 colorspace_2.0-3 rvest_1.0.2 blob_1.2.3
[31] rappdirs_0.3.3 apeglm_1.14.0 ggrepel_0.9.1 haven_2.5.0 crayon_1.5.1
[36] RCurl_1.98-1.6 jsonlite_1.8.0 graph_1.70.0 genefilter_1.74.1 brew_1.0-7
[41] survival_3.3-1 VariantAnnotation_1.38.0 glue_1.6.2 gtable_0.3.0 zlibbioc_1.38.0
[46] XVector_0.32.0 DelayedArray_0.18.0 V8_4.2.0 Rgraphviz_2.36.0 scales_1.2.0
[51] pheatmap_1.0.12 mvtnorm_1.1-3 DBI_1.1.2 edgeR_3.34.1 Rcpp_1.0.8.3
[56] xtable_1.8-4 progress_1.2.2 emdbook_1.3.12 bit_4.0.4 rsvg_2.3.1
[61] AnnotationForge_1.34.1 truncnorm_1.0-8 httr_1.4.3 gplots_3.1.3 RColorBrewer_1.1-3
[66] ellipsis_0.3.2 pkgconfig_2.0.3 XML_3.99-0.9 dbplyr_2.1.1 locfit_1.5-9.5
[71] utf8_1.2.2 tidyselect_1.1.2 rlang_1.0.2 AnnotationDbi_1.54.1 cellranger_1.1.0
[76] munsell_0.5.0 tools_4.1.2 cachem_1.0.6 cli_3.3.0 generics_0.1.2
[81] RSQLite_2.2.14 broom_0.8.0 fastmap_1.1.0 yaml_2.3.5 fs_1.5.2
[86] bit64_4.0.5 caTools_1.18.2 KEGGREST_1.32.0 RBGL_1.68.0 xml2_1.3.3
[91] biomaRt_2.48.3 debugme_1.1.0 compiler_4.1.2 rstudioapi_0.13 filelock_1.0.2
[96] curl_4.3.2 png_0.1-7 reprex_2.0.1 stringi_1.7.6 GenomicFeatures_1.44.2
[101] lattice_0.20-45 Matrix_1.4-0 vctrs_0.4.1 pillar_1.7.0 lifecycle_1.0.1
[106] irlba_2.3.5 data.table_1.14.2 bitops_1.0-7 rtracklayer_1.52.1 R6_2.5.1
[111] BiocIO_1.2.0 latticeExtra_0.6-29 hwriter_1.3.2.1 ShortRead_1.50.0 KernSmooth_2.23-20
[116] MASS_7.3-55 gtools_3.9.2 assertthat_0.2.1 Category_2.58.0 rjson_0.2.21
[121] withr_2.5.0 GenomicAlignments_1.28.0 batchtools_0.9.15 Rsamtools_2.8.0 GenomeInfoDbData_1.2.6
[126] hms_1.1.1 grid_4.1.2 DOT_0.1 coda_0.19-4 GreyListChIP_1.24.0
[131] ashr_2.2-54 mixsqp_0.3-43 bbmle_1.0.25 lubridate_1.8.0 numDeriv_2016.8-1.1
[136] restfulr_0.0.13


The latest version, DiffBind_3.6.1, contains fixes for these issues. You will get these fixes if you upgrade to R_4.2.0 and Bioconductor_3.15.
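To check whether an installation already has at least the version mentioned here, something like the following sketch may help. `meets_minimum()` is a hypothetical helper, not a DiffBind function; the version strings are the ones reported in this thread.

```r
# Hypothetical helper: TRUE if an installed version is at least `minimum`.
# package_version() compares component-wise, so "3.10.0" > "3.6.1" as expected.
meets_minimum <- function(installed, minimum = "3.6.1") {
  package_version(installed) >= package_version(minimum)
}

print(meets_minimum("3.2.4"))  # version from the original post
print(meets_minimum("3.6.1"))  # version named in this reply

# Check the locally installed DiffBind, if there is one.
if (requireNamespace("DiffBind", quietly = TRUE)) {
  print(meets_minimum(as.character(utils::packageVersion("DiffBind"))))
}
```

If the check fails, upgrading R and then running `BiocManager::install(version = "3.15")` is the usual route to a matching Bioconductor release.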


I am still getting an error in DiffBind_3.6.1:

> dba <- dba.blacklist(dba)
Genome detected: Hsapiens.UCSC.hg38
Applying blacklist...
Removed: 142 of 41406 intervals.
Counting control reads for greylist...
Error in value[[3L]](cond) : 
  GreyListChIP error: Error: BiocParallel errors
  3 remote errors, element index: 1, 2, 3
  0 unevaluated and other errors
  first remote error:
Error in h(simpleError(msg, call)): error in evaluating the argument 'BPPARAM' in selecting a method for function 'bplapply': 'list' object cannot be coerced to type 'integer'
> package.version("DiffBind")
[1] "3.6.1"

Seems like it might be a BPPARAM issue though...

> register(BPPARAM = SerialParam())
Error in .registry_init() : 
  'list' object cannot be coerced to type 'integer'

Found the problem: link

You can overcome this issue by setting mc.cores:

> options(mc.cores = 48)
> BiocParallel:::.detectCores()
[1] 48
> dba <- dba.analyze(dba)
Applying Blacklist/Greylists...
Genome detected: Hsapiens.UCSC.hg38
Applying blacklist...
Removed: 142 of 41406 intervals.
Counting control reads for greylist...
Building greylist: data/drip_seq/rlpipes_out/bam/EUFA_input_S41_L004/EUFA_input_S41_L004_hg38.bam
coverage: 803268 bp (0.03%)
Building greylist: data/drip_seq/rlpipes_out/bam/EUFA_BRCA2_input_S42_L004/EUFA_BRCA2_input_S42_L004_hg38.bam
coverage: 939138 bp (0.03%)
Building greylist: data/drip_seq/rlpipes_out/bam/EUFA_PAF_Input_S33/EUFA_PAF_Input_S33_hg38.bam
coverage: 761211 bp (0.02%)
Control1: 99 ranges, 803268 bases
Control2: 92 ranges, 939138 bases
Control3: 105 ranges, 761211 bases
Master greylist: 136 ranges, 1219749 bases
Removed: 64 of 41264 intervals.
Removed 206 (of 41406) consensus peaks.
Normalize DESeq2 with defaults...
Analyzing...
>
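The value 48 above matches that particular machine. As a rough sketch of the same workaround for other machines, the option can be derived from `parallel::detectCores()`, falling back to 1 when detection returns NA. Using `detectCores()` here is an assumption; whether it resolves the `.registry_init()` error depends on the cause described at the linked page.

```r
# Pick an explicit integer worker count and store it in the mc.cores option.
detected <- parallel::detectCores()  # may be NA on some systems
n_cores <- if (is.na(detected)) 1L else as.integer(detected)
options(mc.cores = n_cores)

# Confirm that the option now holds a single positive integer.
print(getOption("mc.cores"))
```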
XPSun • 0
@c0ba434a
Last seen 5 months ago
United States

The issue could be that BiocParallel is not registered. Try library(BiocParallel), then register(SerialParam()), and then run the blacklist function.
