DiffBind error with dba.count
hhhcce • 0
@hhhcce-15490
Last seen 6.7 years ago

I tried running the dba.count function and got the following error message. The bam files seem to be fine.


> chip_seq=dba.count(tamoxifen,summit=250)
Error: Error processing one or more read files. Check warnings().
In addition: There were 28 warnings (use warnings() to see them)

> warnings()
Warning messages:
1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 
26: 
27: 
28: 

 

hhhcce • 0
@hhhcce-15490
Last seen 6.7 years ago

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] BiocInstaller_1.28.0       DiffBind_2.6.6            
 [3] SummarizedExperiment_1.8.1 DelayedArray_0.4.1        
 [5] matrixStats_0.53.1         Biobase_2.38.0            
 [7] GenomicRanges_1.30.3       GenomeInfoDb_1.14.0       
 [9] IRanges_2.12.0             S4Vectors_0.16.0          
[11] BiocGenerics_0.24.0       

loaded via a namespace (and not attached):
 [1] Category_2.44.0          bitops_1.0-6             bit64_0.9-7             
 [4] RColorBrewer_1.1-2       progress_1.1.2           httr_1.3.1              
 [7] Rgraphviz_2.22.0         tools_3.4.4              backports_1.1.2         
[10] R6_2.2.2                 KernSmooth_2.23-15       DBI_0.8                 
[13] lazyeval_0.2.1           colorspace_1.3-2         prettyunits_1.0.2       
[16] RMySQL_0.10.14           bit_1.1-12               compiler_3.4.4          
[19] sendmailR_1.2-1          graph_1.56.0             rtracklayer_1.38.3      
[22] caTools_1.17.1           scales_0.5.0             checkmate_1.8.5         
[25] BatchJobs_1.7            genefilter_1.60.0        RBGL_1.54.0             
[28] stringr_1.3.0            digest_0.6.15            Rsamtools_1.30.0        
[31] AnnotationForge_1.20.0   XVector_0.18.0           base64enc_0.1-3         
[34] pkgconfig_2.0.1          limma_3.34.9             rlang_0.2.0             
[37] RSQLite_2.1.0            BBmisc_1.11              bindr_0.1.1             
[40] GOstats_2.44.0           hwriter_1.3.2            BiocParallel_1.12.0     
[43] gtools_3.5.0             dplyr_0.7.4              RCurl_1.95-4.10         
[46] magrittr_1.5             GO.db_3.5.0              GenomeInfoDbData_1.0.0  
[49] Matrix_1.2-11            Rcpp_0.12.16             munsell_0.4.3           
[52] stringi_1.1.7            edgeR_3.20.9             zlibbioc_1.24.0         
[55] gplots_3.0.1             plyr_1.8.4               grid_3.4.4              
[58] blob_1.1.1               ggrepel_0.7.0            gdata_2.18.0            
[61] lattice_0.20-35          Biostrings_2.46.0        splines_3.4.4           
[64] GenomicFeatures_1.30.3   annotate_1.56.2          locfit_1.5-9.1          
[67] pillar_1.2.1             rjson_0.2.15             systemPipeR_1.12.0      
[70] biomaRt_2.34.2           glue_1.2.0               XML_3.98-1.10           
[73] ShortRead_1.36.1         latticeExtra_0.6-28      data.table_1.10.4-3     
[76] gtable_0.2.0             amap_0.8-14              assertthat_0.2.0        
[79] ggplot2_2.2.1            xtable_1.8-2             survival_2.41-3         
[82] tibble_1.4.2             pheatmap_1.0.8           GenomicAlignments_1.14.2
[85] AnnotationDbi_1.40.0     memoise_1.1.0            bindrcpp_0.2.2          
[88] brew_1.0-6               GSEABase_1.40.1 
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 5 weeks ago
Cambridge, UK

I'm not sure why the warning messages are blank. One thing to try is to run dba.count() in serial mode by setting bParallel=FALSE. This may make the specific warnings more explicit.
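If the warnings still print as blank, one generic base-R option is to record each warning's text at the moment it is raised. This is only a sketch: `collect_warnings` is a hypothetical helper, not part of DiffBind, and it has not been tried on this data set.

```r
# Hypothetical helper (base R only, not a DiffBind function): evaluate an
# expression, record the text of every warning it raises, and return the
# result (or the error object, if it fails) together with those messages.
collect_warnings <- function(expr) {
  msgs <- character(0)
  value <- tryCatch(
    withCallingHandlers(
      expr,
      warning = function(w) {
        msgs <<- c(msgs, conditionMessage(w))  # record the warning text
        invokeRestart("muffleWarning")         # then continue evaluating
      }
    ),
    error = function(e) e                      # keep the error instead of aborting
  )
  list(value = value, warnings = msgs)
}

# Usage with the call from this thread (needs DiffBind and the DBA object):
# res <- collect_warnings(dba.count(tamoxifen, bParallel = FALSE))
# res$warnings
```

Because the error is caught rather than rethrown, any warnings raised before the failure remain available in `res$warnings`.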

When we see this error, one of the following problems is usually the cause:

  • The files aren't really where you think they are. You can test this as follows: 
> file_test("-f",myDBA$class[10,]) # should return TRUE for every file if the filepaths are correct
  • One or more .bam files has no matching .bai index file. You can test this as follows: 
> file_test("-f",sub("\\.bam$",".bam.bai",myDBA$class[10,])) # should return TRUE for every file
  • There is something wrong with the format of a bam file. You can check consistency with, e.g., samtools flagstat.
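The first two checks can be combined into one table so that a problem file is easy to spot. This is a minimal sketch in base R: `check_bam_files` is a hypothetical helper, not a DiffBind function, and it assumes an index is named either `sample.bam.bai` or `sample.bai`.

```r
# Hypothetical helper (not part of DiffBind): for each BAM path, report
# whether the file exists and whether an index file sits next to it.
check_bam_files <- function(bam_paths) {
  bam_paths <- as.character(bam_paths)
  # Assumed index naming conventions: sample.bam.bai or sample.bai
  bai_long  <- paste0(bam_paths, ".bai")
  bai_short <- sub("\\.bam$", ".bai", bam_paths)
  # Only trust the short form when the path really ends in ".bam"
  has_short <- grepl("\\.bam$", bam_paths) & file.exists(bai_short)
  data.frame(
    bam        = bam_paths,
    bam_exists = file.exists(bam_paths),
    has_index  = file.exists(bai_long) | has_short,
    stringsAsFactors = FALSE
  )
}

# Usage with the paths from the checks above (needs the DBA object):
# check_bam_files(myDBA$class[10,])
```

Any row with a FALSE in either column points at the file to look at first.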
hhhcce • 0
@hhhcce-15490
Last seen 6.7 years ago

Dear Rory,

I added bParallel=FALSE to the code and it seems to work. The bam files can now be loaded. I checked the bam files using samtools flagstat and the code you suggested, and they seem to be fine.

However, I stumbled upon another problem when I started running dba.count. The process was killed (probably due to high memory usage). I tried setting $config$yieldSize=50000 and bUseSummarizeOverlaps=TRUE but it didn't seem to help.

> Ngn2=dba.count(tamoxifen,minOverlap=2,bParallel=FALSE,bUseSummarizeOverlaps=TRUE)
Sample: novirus-ATCACG_S1_R1_001.sorted.rmdup.bam125
Sample: novirus-GATCAG_S9_R1_001.sorted.rmdup.bam125
Sample: Ngn2-2d-GGCTAC_combined.sorted.rmdup.bam125
Sample: Ngn2-2d-TTAGGC_combined.sorted.rmdup.bam125
Sample: SL-6d-ACAGTG_combined.sorted.rmdup.bam125
Sample: SL-6d-AGTCAA_combined.sorted.rmdup.bam125
Sample: SL-6d-B-GTCCGC_S2_R1_001.sorted.rmdup.bam125
Sample: SL1-GTCCGC_S1_R1_001.sorted.rmdup.bam125
Sample: SL2-GTGGCC_S3_R1_001.sorted.rmdup.bam125
Sample: SLC-6d-ATGTCA_combined.sorted.rmdup.bam125
Sample: SLC-6d-CAGATC_combined.sorted.rmdup.bam125
Sample: SLC-6d-GTGAAA_S3_R1_001.sorted.rmdup.bam125
Sample: SLC1-GTGAAA_S2_R1_001.sorted.rmdup.bam125
Sample: SLC2-GTTTCG_S4_R1_001.sorted.rmdup.bam125
Sample: novirus-input-CGATGT_S2_R1_001.sorted.rmdup.bam125
Sample: novirus-input-TAGCTT_S10_R1_001.sorted.rmdup.bam125
Sample: Ngn2-2d-input-CTTGTA_S12_R1_001.sorted.rmdup.bam125
Sample: Ngn2-2d-input-TGACCA_S4_R1_001.sorted.rmdup.bam125
Sample: SL-6d-input-AGTTCC_combined.sorted.rmdup.bam125
Sample: SL-6d-input-GCCAAT_S6_R1_001.sorted.rmdup.bam125
Sample: SL-Input-6d-GTGGCC_S4_R1_001.sorted.rmdup.bam125
Sample: SL1-input-CGTACG_S5_R1_001.sorted.rmdup.bam125
Sample: SL2-input-ACTGAT_S7_R1_001.sorted.rmdup.bam125
Sample: SLC-6d-input-ACTTGA_S8_R1_001.sorted.rmdup.bam125
Sample: SLC-6d-input-CCGTCC_combined.sorted.rmdup.bam125
Sample: SLC-Input-6d-GTTTCG_S5_R1_001.sorted.rmdup.bam125
Sample: SLC1-input-GAGTGG_S6_R1_001.sorted.rmdup.bam125
Sample: SLC2-input-ATTCCT_S8_R1_001.sorted.rmdup.bam125
Killed

Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 5 weeks ago
Cambridge, UK

Ahh, I suspect this was the real problem all along. The memory issue isn't happening during the counting (so yieldSize/summarizeOverlaps won't help), but in the construction of the binding matrix.

How many peaks are there in the consensus set? If you print the DBA object before you run dba.count(), it will tell you how many sites there are. I wonder if this is a very high number? The binding matrix will have that many rows x 28 columns, and there needs to be enough memory to hold it.
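To get a feel for the scale, here is a rough back-of-the-envelope calculation. It assumes each matrix entry is stored as an 8-byte double and ignores any extra copies held internally, so treat the figures as a lower bound. `estimate_matrix_bytes` is a hypothetical helper, and the site counts are illustrative, not taken from this data set.

```r
# Hypothetical helper: rough size of a dense numeric matrix with one row per
# consensus site and one column per sample.
# Assumption: 8 bytes per entry (double); internal copies are not counted.
estimate_matrix_bytes <- function(n_sites, n_samples, bytes_per_value = 8) {
  as.numeric(n_sites) * n_samples * bytes_per_value
}

# Illustrative site counts only; 28 samples as in the run above.
for (n_sites in c(1e5, 1e6, 1e7)) {
  gib <- estimate_matrix_bytes(n_sites, 28) / 1024^3
  cat(sprintf("%.0f sites x 28 samples: about %.2f GiB\n", n_sites, gib))
}
```

Comparing these numbers with the memory available on the machine gives a quick indication of whether the consensus set is simply too large.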
