DiffBind GreyListChIP error: Error: BiocParallel errors
kwangbom • 0
United States

I'd like to first thank the developers for a fine set of tools. I am performing a two-group comparison where each group contains 3 to 4 treated and input pairs. I am running the analysis on an AWS EC2 instance on which I have installed r-base and DiffBind in a conda environment. I am currently trying to resolve the following issue from dba.analyze:

> h3k27ac <- dba.analyze(h3k27ac)
Applying Blacklist/Greylists...
Genome detected: Hsapiens.NCBI.GRCh38
Applying blacklist...
Removed: 5 of 58439 intervals.
Counting control reads for greylist...
Blacklist error: Error in value[[3L]](cond): GreyListChIP error: Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: 'seqlengths' contains NAs or negative values

Applying the blacklist works, but greylisting fails with BiocParallel errors. From googling, I learned about the following, but it did not help at all:

> BiocParallel::register(BiocParallel::SerialParam())

I know dba.analyze(h3k27ac, bGreylist=FALSE) works but, given my input data, greylisting should have no issue from my perspective. I would appreciate any help or insight. Here's my sessionInfo FYI.

> sessionInfo( )
R version 4.1.0 (2021-05-18)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Amazon Linux 2

Matrix products: default
BLAS/LAPACK: /home/ec2-user/miniconda3/envs/R/lib/libopenblasp-r0.3.15.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] BiocParallel_1.26.1         DiffBind_3.2.4
 [3] SummarizedExperiment_1.22.0 Biobase_2.52.0
 [5] MatrixGenerics_1.4.0        matrixStats_0.59.0
 [7] GenomicRanges_1.44.0        GenomeInfoDb_1.28.1
 [9] IRanges_2.26.0              S4Vectors_0.30.0
[11] BiocGenerics_0.38.0

loaded via a namespace (and not attached):
  [1] backports_1.2.1          GOstats_2.58.0           BiocFileCache_2.0.0
  [4] plyr_1.8.6               GSEABase_1.54.0          splines_4.1.0
  [7] ggplot2_3.3.5            amap_0.8-18              digest_0.6.27
 [10] invgamma_1.1             GO.db_3.13.0             SQUAREM_2021.1
 [13] fansi_0.5.0              magrittr_2.0.1           checkmate_2.0.0
 [16] memoise_2.0.0            BSgenome_1.60.0          base64url_1.4
 [19] limma_3.48.1             Biostrings_2.60.1        annotate_1.70.0
 [22] systemPipeR_1.26.3       bdsmatrix_1.3-4          prettyunits_1.1.1
 [25] jpeg_0.1-8.1             colorspace_2.0-2         blob_1.2.1
 [28] rappdirs_0.3.3           apeglm_1.14.0            ggrepel_0.9.1
 [31] dplyr_1.0.7              crayon_1.4.1             RCurl_1.98-1.3
 [34] jsonlite_1.7.2           graph_1.70.0             genefilter_1.74.0
 [37] brew_1.0-6               survival_3.2-11          VariantAnnotation_1.38.0
 [40] glue_1.4.2               gtable_0.3.0             zlibbioc_1.38.0
 [43] XVector_0.32.0           DelayedArray_0.18.0      V8_3.4.2
 [46] Rgraphviz_2.36.0         scales_1.1.1             pheatmap_1.0.12
 [49] mvtnorm_1.1-2            DBI_1.1.1                edgeR_3.34.0
 [52] Rcpp_1.0.7               xtable_1.8-4             progress_1.2.2
 [55] emdbook_1.3.12           bit_4.0.4                rsvg_2.1.2
 [58] AnnotationForge_1.34.0   truncnorm_1.0-8          httr_1.4.2
 [61] gplots_3.1.1             RColorBrewer_1.1-2       ellipsis_0.3.2
 [64] pkgconfig_2.0.3          XML_3.99-0.6             dbplyr_2.1.1
 [67] locfit_1.5-9.4           utf8_1.2.1               tidyselect_1.1.1
 [70] rlang_0.4.11             AnnotationDbi_1.54.1     munsell_0.5.0
 [73] tools_4.1.0              cachem_1.0.5             generics_0.1.0
 [76] RSQLite_2.2.5            stringr_1.4.0            fastmap_1.1.0
 [79] yaml_2.2.1               bit64_4.0.5              caTools_1.18.2
 [82] purrr_0.3.4              KEGGREST_1.32.0          RBGL_1.68.0
 [85] xml2_1.3.2               biomaRt_2.48.2           compiler_4.1.0
 [88] rstudioapi_0.13          filelock_1.0.2           curl_4.3.2
 [91] png_0.1-7                geneplotter_1.70.0       tibble_3.1.2
 [94] stringi_1.7.3            GenomicFeatures_1.44.0   lattice_0.20-44
 [97] Matrix_1.3-4             vctrs_0.3.8              pillar_1.6.1
[100] lifecycle_1.0.0          irlba_2.3.3              data.table_1.14.0
[103] bitops_1.0-7             rtracklayer_1.52.0       R6_2.5.0
[106] BiocIO_1.2.0             latticeExtra_0.6-29      hwriter_1.3.2
[109] ShortRead_1.50.0         KernSmooth_2.23-20       MASS_7.3-54
[112] gtools_3.9.2             assertthat_0.2.1         DESeq2_1.32.0
[115] Category_2.58.0          rjson_0.2.20             withr_2.4.2
[118] GenomicAlignments_1.28.0 batchtools_0.9.15        Rsamtools_2.8.0
[121] GenomeInfoDbData_1.2.6   hms_1.1.0                grid_4.1.0
[124] DOT_0.1                  coda_0.19-4              GreyListChIP_1.24.0
[127] ashr_2.2-47              mixsqp_0.3-43            bbmle_1.0.23.1
[130] numDeriv_2016.8-1.1      restfulr_0.0.13
Rory Stark ★ 4.1k
CRUK, Cambridge, UK

This error is coming from the GreyListChIP package. I have just taken ownership of that package but am still learning the code so please bear with me.

I assume you are running dba.analyze() after running dba.count()? If so, one thing to try is to run dba.blacklist() before running dba.count():

h3k27ac <- dba(sampleSheet="mysamples.csv")
h3k27ac <- dba.blacklist(h3k27ac)
h3k27ac <- dba.count(h3k27ac)
h3k27ac <- dba.analyze(h3k27ac)

Does this change things? It really should work either before or after calling dba.count(), so even if calling it first fixes things, I'd still like to get to the bottom of your issue. I see that there is some issue with bad seqlengths. I suspect some mismatch between the ... If we can narrow this down to one control .bam file, I should be able to debug it if you can provide me access to that bam file and a copy of your h3k27ac object.

You are on the right track in trying to run it serially to see the error messages; however, the best way to do this is to specify cores=1 in the call to dba.blacklist():

 h3k27ac <- dba.blacklist(h3k27ac, cores=1)
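
One possible way to narrow the failure down to a single control file is to process the controls one at a time and record which of them raise an error. The sketch below uses base R only. The names `find_failing()`, `toy_count()` and the file names are hypothetical, and `toy_count()` merely simulates the error message from this thread; in practice it would be replaced by whatever per-file step is failing (for example, a wrapper that counts reads for one control BAM).

```r
# Run `count_fn` on each file serially and collect error messages.
# Returns a named character vector: NA for files that succeeded,
# the error message for files that failed.
find_failing <- function(files, count_fn) {
  res <- vapply(files, function(f) {
    tryCatch({
      count_fn(f)
      NA_character_
    }, error = function(e) conditionMessage(e))
  }, character(1))
  names(res) <- files
  res
}

# Toy stand-in (assumption): fails for one hypothetical file name.
toy_count <- function(f) {
  if (f == "control2.bam") {
    stop("'seqlengths' contains NAs or negative values")
  }
  invisible(TRUE)
}

controls <- c("control1.bam", "control2.bam", "control3.bam")
status   <- find_failing(controls, toy_count)
failing  <- names(status)[!is.na(status)]
print(failing)
```

If every control fails, as the element indices in the error above suggest, the problem is more likely to lie in something shared by all files (such as the reference genome or chromosome naming) than in one damaged BAM.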

Hi Rory,

I am having this exact same problem and wonder if you had any other suggestions. I get the same error whether I run the blacklist before or after counting. I have my error below and also sessionInfo, but I am happy to provide more. My BAM files are from an alignment to GRCh37, so I thought this might be the issue, except that the blacklist functionality works fine with hg19.

dbObj <- dba.blacklist(dbObj, blacklist=FALSE, greylist="BSgenome.Hsapiens.UCSC.hg19", cores=1)

    Genome detected: Hsapiens.UCSC.hg19
    Counting control reads for greylist...
    Error in value[[3L]](cond) : 
      GreyListChIP error: Error: BiocParallel errors
      element index: 1, 2, 3, 4, 5, 6
      first error: 'seqlengths' contains NAs or negative values
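
One thing that may be worth checking (an assumption, not something confirmed in this thread) is whether the chromosome names in the BAM headers match those of the reference genome. GRCh37/Ensembl-style alignments typically name chromosomes `1`, `2`, `MT`, while UCSC hg19 uses `chr1`, `chr2`, `chrM`, and looking up lengths by a name that does not exist yields `NA`. The base R sketch below illustrates this with toy values; `to_ucsc()` is a hypothetical helper. With real data, the BAM lengths could be read from the `targets` element returned by `Rsamtools::scanBamHeader()` and the reference lengths from `seqlengths()` on the BSgenome object.

```r
# Illustrative sequence lengths (toy subset; real values would come from
# the BAM header and the BSgenome object).
bam_lengths <- c("1" = 249250621L, "2" = 243199373L, "MT" = 16569L)
ref_lengths <- c(chr1 = 249250621L, chr2 = 243199373L, chrM = 16571L)

# Look up the reference names in the BAM header; unmatched names give NA.
lookup <- bam_lengths[names(ref_lengths)]
names(lookup) <- names(ref_lengths)
print(lookup)

# Hypothetical helper: convert Ensembl-style names to UCSC style.
to_ucsc <- function(x) ifelse(x == "MT", "chrM", paste0("chr", x))

renamed <- bam_lengths
names(renamed) <- to_ucsc(names(renamed))

# After renaming, every reference name is found in the BAM header.
lookup2 <- renamed[names(ref_lengths)]
print(anyNA(lookup2))

# Names can match while lengths still differ (here: the mitochondrion).
print(names(ref_lengths)[lookup2 != ref_lengths])
```

For real Bioconductor objects, `GenomeInfoDb::seqlevelsStyle()` is the usual way to inspect or switch naming styles rather than a hand-written helper like the one above.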



> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /n/app/openblas/0.2.19/lib/libopenblas_core2p-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] DiffBind_3.2.5              SummarizedExperiment_1.22.0
 [3] Biobase_2.52.0              MatrixGenerics_1.4.2       
 [5] matrixStats_0.60.1          GenomicRanges_1.44.0       
 [7] GenomeInfoDb_1.28.1         IRanges_2.26.0             
 [9] S4Vectors_0.30.0            BiocGenerics_0.38.0        

loaded via a namespace (and not attached):
  [1] backports_1.2.1          GOstats_2.58.0           BiocFileCache_2.0.0     
  [4] plyr_1.8.6               GSEABase_1.54.0          splines_4.1.1           
  [7] BiocParallel_1.26.2      ggplot2_3.3.5            amap_0.8-18             
 [10] digest_0.6.27            invgamma_1.1             GO.db_3.13.0            
 [13] SQUAREM_2021.1           fansi_0.5.0              magrittr_2.0.1          
 [16] checkmate_2.0.0          memoise_2.0.0            BSgenome_1.60.0         
 [19] base64url_1.4            limma_3.48.3             Biostrings_2.60.2       
 [22] annotate_1.70.0          systemPipeR_1.26.3       bdsmatrix_1.3-4         
 [25] prettyunits_1.1.1        jpeg_0.1-9               colorspace_2.0-2        
 [28] blob_1.2.2               rappdirs_0.3.3           apeglm_1.14.0           
 [31] ggrepel_0.9.1            dplyr_1.0.7              crayon_1.4.1            
 [34] RCurl_1.98-1.4           jsonlite_1.7.2           graph_1.70.0            
 [37] genefilter_1.74.0        brew_1.0-6               survival_3.2-11         
 [40] VariantAnnotation_1.38.0 glue_1.4.2               gtable_0.3.0            
 [43] zlibbioc_1.38.0          XVector_0.32.0           DelayedArray_0.18.0     
 [46] V8_3.4.2                 Rgraphviz_2.36.0         scales_1.1.1            
 [49] pheatmap_1.0.12          mvtnorm_1.1-2            DBI_1.1.1               
 [52] edgeR_3.34.0             Rcpp_1.0.7               xtable_1.8-4            
 [55] progress_1.2.2           emdbook_1.3.12           bit_4.0.4               
 [58] rsvg_2.1.2               AnnotationForge_1.34.0   truncnorm_1.0-8         
 [61] httr_1.4.2               gplots_3.1.1             RColorBrewer_1.1-2      
 [64] ellipsis_0.3.2           pkgconfig_2.0.3          XML_3.99-0.7            
 [67] dbplyr_2.1.1             locfit_1.5-9.4           utf8_1.2.2              
 [70] tidyselect_1.1.1         rlang_0.4.11             AnnotationDbi_1.54.1    
 [73] munsell_0.5.0            tools_4.1.1              cachem_1.0.6            
 [76] generics_0.1.0           RSQLite_2.2.8            stringr_1.4.0           
 [79] fastmap_1.1.0            yaml_2.2.1               bit64_4.0.5             
 [82] caTools_1.18.2           purrr_0.3.4              KEGGREST_1.32.0         
 [85] RBGL_1.68.0              xml2_1.3.2               biomaRt_2.48.3          
 [88] compiler_4.1.1           rstudioapi_0.13          filelock_1.0.2          
 [91] curl_4.3.2               png_0.1-7                tibble_3.1.4            
 [94] stringi_1.7.4            GenomicFeatures_1.44.1   lattice_0.20-44         
 [97] Matrix_1.3-4             vctrs_0.3.8              pillar_1.6.2            
[100] lifecycle_1.0.0          irlba_2.3.3              data.table_1.14.0       
[103] bitops_1.0-7             rtracklayer_1.52.1       R6_2.5.1                
[106] BiocIO_1.2.0             latticeExtra_0.6-29      hwriter_1.3.2           
[109] ShortRead_1.50.0         KernSmooth_2.23-20       MASS_7.3-54             
[112] gtools_3.9.2             assertthat_0.2.1         Category_2.58.0         
[115] rjson_0.2.20             withr_2.4.2              GenomicAlignments_1.28.0
[118] batchtools_0.9.15        Rsamtools_2.8.0          GenomeInfoDbData_1.2.6  
[121] hms_1.1.0                grid_4.1.1               DOT_0.1                 
[124] coda_0.19-4              GreyListChIP_1.24.0      ashr_2.2-47             
[127] mixsqp_0.3-43            bbmle_1.0.24             numDeriv_2016.8-1.1     
[130] restfulr_0.0.13   