DiffBind error with dba.count
hhhcce • 0
@hhhcce-15490
Last seen 7.0 years ago

I tried running the dba.count function and got the following error message. The BAM files seem to be fine.


> chip_seq=chip_seq(tamoxifen,summit=250)
Error: Error processing one or more read files. Check warnings().
In addition: There were 28 warnings (use warnings() to see them)

> warnings()
Warning messages:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28:

 

hhhcce • 0
@hhhcce-15490
Last seen 7.0 years ago

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] BiocInstaller_1.28.0       DiffBind_2.6.6
 [3] SummarizedExperiment_1.8.1 DelayedArray_0.4.1
 [5] matrixStats_0.53.1         Biobase_2.38.0
 [7] GenomicRanges_1.30.3       GenomeInfoDb_1.14.0
 [9] IRanges_2.12.0             S4Vectors_0.16.0
[11] BiocGenerics_0.24.0

 

 


loaded via a namespace (and not attached):
 [1] Category_2.44.0          bitops_1.0-6             bit64_0.9-7
 [4] RColorBrewer_1.1-2       progress_1.1.2           httr_1.3.1
 [7] Rgraphviz_2.22.0         tools_3.4.4              backports_1.1.2
[10] R6_2.2.2                 KernSmooth_2.23-15       DBI_0.8
[13] lazyeval_0.2.1           colorspace_1.3-2         prettyunits_1.0.2
[16] RMySQL_0.10.14           bit_1.1-12               compiler_3.4.4
[19] sendmailR_1.2-1          graph_1.56.0             rtracklayer_1.38.3
[22] caTools_1.17.1           scales_0.5.0             checkmate_1.8.5
[25] BatchJobs_1.7            genefilter_1.60.0        RBGL_1.54.0
[28] stringr_1.3.0            digest_0.6.15            Rsamtools_1.30.0
[31] AnnotationForge_1.20.0   XVector_0.18.0           base64enc_0.1-3
[34] pkgconfig_2.0.1          limma_3.34.9             rlang_0.2.0
[37] RSQLite_2.1.0            BBmisc_1.11              bindr_0.1.1
[40] GOstats_2.44.0           hwriter_1.3.2            BiocParallel_1.12.0
[43] gtools_3.5.0             dplyr_0.7.4              RCurl_1.95-4.10
[46] magrittr_1.5             GO.db_3.5.0              GenomeInfoDbData_1.0.0
[49] Matrix_1.2-11            Rcpp_0.12.16             munsell_0.4.3
[52] stringi_1.1.7            edgeR_3.20.9             zlibbioc_1.24.0
[55] gplots_3.0.1             plyr_1.8.4               grid_3.4.4
[58] blob_1.1.1               ggrepel_0.7.0            gdata_2.18.0
[61] lattice_0.20-35          Biostrings_2.46.0        splines_3.4.4
[64] GenomicFeatures_1.30.3   annotate_1.56.2          locfit_1.5-9.1
[67] pillar_1.2.1             rjson_0.2.15             systemPipeR_1.12.0
 

[70] biomaRt_2.34.2           glue_1.2.0               XML_3.98-1.10           
[73] ShortRead_1.36.1         latticeExtra_0.6-28      data.table_1.10.4-3     
[76] gtable_0.2.0             amap_0.8-14              assertthat_0.2.0        
[79] ggplot2_2.2.1            xtable_1.8-2             survival_2.41-3         
[82] tibble_1.4.2             pheatmap_1.0.8           GenomicAlignments_1.14.2
[85] AnnotationDbi_1.40.0     memoise_1.1.0            bindrcpp_0.2.2          
[88] brew_1.0-6               GSEABase_1.40.1 
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 12 weeks ago
Cambridge, UK

I'm not sure why the warning messages are blank. One thing to try is to run dba.count() in serial mode by setting bParallel=FALSE. This may make the specific warnings more explicit.

When we see this error, one of the following problems is usually the cause:

  • The files aren't really where you think they are. You can test this as follows:
> file_test("-f",myDBA$class[10,]) #should return all TRUE values if the filepaths are correct.
  • There is no matching .bai file for one or more of the .bam files. You can test this as follows:
> file_test("-f",sub("\\.bam$",".bam.bai",myDBA$class[10,])) #should return all TRUE values
  • There is something wrong with the format of a BAM file. You can check consistency, e.g. with samtools flagstat.
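
The first two checks above can be combined into one small helper. This is only a sketch: `check_bam_files` is a hypothetical name, not part of DiffBind, and accepting an index named `sample.bai` as well as `sample.bam.bai` is an assumption about how the files were indexed.

```r
# Hypothetical helper (not a DiffBind function): report, per BAM path,
# whether the file exists and whether an index file sits next to it.
check_bam_files <- function(bam_paths) {
  bam_paths <- unname(as.character(bam_paths))
  # Index named <file>.bam.bai
  idx_appended <- paste0(bam_paths, ".bai")
  # Assumption: an index named <file>.bai is also acceptable
  idx_swapped <- sub("\\.bam$", ".bai", bam_paths)
  has_swapped <- idx_swapped != bam_paths & file_test("-f", idx_swapped)
  data.frame(
    bam = bam_paths,
    bam_exists = file_test("-f", bam_paths),
    index_exists = file_test("-f", idx_appended) | has_swapped,
    stringsAsFactors = FALSE
  )
}

# Placeholder paths for illustration; with a DBA object one would
# presumably pass myDBA$class[10, ] as in the checks above.
report <- check_bam_files(c("sample1.bam", "sample2.bam"))
# Show only the rows that have a problem
print(report[!(report$bam_exists & report$index_exists), ])
```

Any row printed at the end points to a sample whose BAM file or index could not be found at the recorded path.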
hhhcce • 0
@hhhcce-15490
Last seen 7.0 years ago

Dear Rory,

I added bParallel=FALSE to the code and it seems to work. The BAM files can now be loaded. I checked them using samtools flagstat and your code, and they seem to be fine.

However, I stumbled upon another problem when I started running dba.count. The process was killed (probably due to high memory usage). I tried setting $config$yieldSize=50000 and bUseSummarizeOverlaps=TRUE but it didn't seem to help.

> Ngn2=dba.count(tamoxifen,minOverlap=2,bParallel=FALSE,bUseSummarizeOverlaps=TRUE)
Sample: novirus-ATCACG_S1_R1_001.sorted.rmdup.bam125
Sample: novirus-GATCAG_S9_R1_001.sorted.rmdup.bam125
Sample: Ngn2-2d-GGCTAC_combined.sorted.rmdup.bam125
Sample: Ngn2-2d-TTAGGC_combined.sorted.rmdup.bam125
Sample: SL-6d-ACAGTG_combined.sorted.rmdup.bam125
Sample: SL-6d-AGTCAA_combined.sorted.rmdup.bam125
Sample: SL-6d-B-GTCCGC_S2_R1_001.sorted.rmdup.bam125
Sample: SL1-GTCCGC_S1_R1_001.sorted.rmdup.bam125
Sample: SL2-GTGGCC_S3_R1_001.sorted.rmdup.bam125
Sample: SLC-6d-ATGTCA_combined.sorted.rmdup.bam125
Sample: SLC-6d-CAGATC_combined.sorted.rmdup.bam125
Sample: SLC-6d-GTGAAA_S3_R1_001.sorted.rmdup.bam125
Sample: SLC1-GTGAAA_S2_R1_001.sorted.rmdup.bam125
Sample: SLC2-GTTTCG_S4_R1_001.sorted.rmdup.bam125
Sample: novirus-input-CGATGT_S2_R1_001.sorted.rmdup.bam125
Sample: novirus-input-TAGCTT_S10_R1_001.sorted.rmdup.bam125
Sample: Ngn2-2d-input-CTTGTA_S12_R1_001.sorted.rmdup.bam125
Sample: Ngn2-2d-input-TGACCA_S4_R1_001.sorted.rmdup.bam125
Sample: SL-6d-input-AGTTCC_combined.sorted.rmdup.bam125
Sample: SL-6d-input-GCCAAT_S6_R1_001.sorted.rmdup.bam125
Sample: SL-Input-6d-GTGGCC_S4_R1_001.sorted.rmdup.bam125
Sample: SL1-input-CGTACG_S5_R1_001.sorted.rmdup.bam125
Sample: SL2-input-ACTGAT_S7_R1_001.sorted.rmdup.bam125
Sample: SLC-6d-input-ACTTGA_S8_R1_001.sorted.rmdup.bam125
Sample: SLC-6d-input-CCGTCC_combined.sorted.rmdup.bam125
Sample: SLC-Input-6d-GTTTCG_S5_R1_001.sorted.rmdup.bam125
Sample: SLC1-input-GAGTGG_S6_R1_001.sorted.rmdup.bam125
Sample: SLC2-input-ATTCCT_S8_R1_001.sorted.rmdup.bam125
Killed

Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 12 weeks ago
Cambridge, UK

Ahh, I suspect this was the real problem all along. The memory issue isn't happening during the counting (so yieldSize/summarizeOverlaps won't help), but in the construction of the binding matrix.

How many peaks are there in the consensus set? If you print out the DBA object before you run dba.count(), it will tell you how many sites there are. I wonder if this is a very high number? The binding matrix will have that many rows x 28 columns, and there needs to be enough memory for that.
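
To get a feel for the sizes involved, here is a back-of-the-envelope sketch. `estimate_matrix_gib` is a hypothetical helper, the site counts are illustrative and not taken from this thread, and 8 bytes per value assumes a plain double-precision matrix. Treat the result as a lower bound: real usage is likely several times higher, since R copies objects and more than one value may be stored per site and sample.

```r
# Hypothetical helper: size in GiB of one numeric matrix with
# n_sites rows and n_samples columns, at bytes_per_value per entry.
estimate_matrix_gib <- function(n_sites, n_samples = 28, bytes_per_value = 8) {
  n_sites * n_samples * bytes_per_value / 1024^3
}

# Illustrative site counts only (assumptions, not from the thread)
for (n in c(1e5, 1e6, 1e7)) {
  cat(sprintf("%.0f sites x 28 samples: %.2f GiB\n", n, estimate_matrix_gib(n)))
}
```

If the number of sites reported for the DBA object is far beyond what the machine can hold at this rate, reducing the consensus set (for example with a stricter minOverlap) may be worth trying before counting.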
