Hi folks,
I am having a trouble with dba.count function in diffbind. It fails with warnings but without any specific information. Please see below:
> obj <- dba(sampleSheet='analysis_H3K4me3_samples.csv')
S1 Eye A None 1 Bed
S2 Eye A None 2 Bed
S3 Eye B None 1 Bed
S4 Eye B None 2 Bed
S5 Eye C None 1 Bed
S6 Eye C None 2 Bed
> obj
6 Samples, 19333 sites in matrix (23582 total):
ID Tissue Condition Treatment Replicate Caller Intervals
1 S1 Eye A None 1 Bed 17211
2 S2 Eye A None 2 Bed 16219
3 S3 Eye B None 1 Bed 18846
4 S4 Eye B None 2 Bed 19704
5 S5 Eye C None 1 Bed 17983
6 S6 Eye C None 2 Bed 18471
> obj <- dba.count(obj)
Error: Error processing one or more read files. Check warnings().
In addition: Warning messages:
1:
2:
3:
4:
5:
6:
7:
8:
9:
> warnings()
Warning messages:
1:
2:
3:
4:
5:
6:
7:
8:
9:
>
I checked and the path of the files in sample sheet is fine. I am using Bed files as input. Thanks a lot in advance for your help!!
Best,
U
Here is the output of sessioninfo if you need:
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.4
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] DiffBind_2.4.8 SummarizedExperiment_1.6.4
[3] DelayedArray_0.2.7 matrixStats_0.52.2
[5] Biobase_2.36.2 GenomicRanges_1.28.5
[7] GenomeInfoDb_1.12.2 IRanges_2.10.3
[9] S4Vectors_0.14.4 BiocGenerics_0.22.0
loaded via a namespace (and not attached):
[1] edgeR_3.18.1 bit64_0.9-7 splines_3.4.1
[4] gtools_3.5.0 assertthat_0.2.0 latticeExtra_0.6-28
[7] amap_0.8-14 RBGL_1.52.0 blob_1.1.0
[10] GenomeInfoDbData_0.99.0 Rsamtools_1.28.0 ggrepel_0.6.5
[13] Category_2.42.1 RSQLite_2.0 backports_1.1.0
[16] lattice_0.20-35 glue_1.1.1 limma_3.32.6
[19] digest_0.6.12 RColorBrewer_1.1-2 XVector_0.16.0
[22] checkmate_1.8.3 colorspace_1.3-2 Matrix_1.2-11
[25] plyr_1.8.4 GSEABase_1.38.1 XML_3.98-1.9
[28] pkgconfig_2.0.1 pheatmap_1.0.8 ShortRead_1.34.1
[31] biomaRt_2.32.1 genefilter_1.58.1 zlibbioc_1.22.0
[34] xtable_1.8-2 GO.db_3.4.1 scales_0.5.0
[37] brew_1.0-6 gdata_2.18.0 BiocParallel_1.10.1
[40] tibble_1.3.4 annotate_1.54.0 ggplot2_2.2.1
[43] GenomicFeatures_1.28.4 lazyeval_0.2.0 magrittr_1.5
[46] survival_2.41-3 memoise_1.1.0 systemPipeR_1.10.2
[49] fail_1.3 gplots_3.0.1 hwriter_1.3.2
[52] GOstats_2.42.0 graph_1.54.0 tools_3.4.1
[55] BBmisc_1.11 stringr_1.2.0 sendmailR_1.2-1
[58] munsell_0.4.3 locfit_1.5-9.1 bindrcpp_0.2
[61] AnnotationDbi_1.38.2 Biostrings_2.44.2 compiler_3.4.1
[64] caTools_1.17.1 rlang_0.1.2 grid_3.4.1
[67] RCurl_1.95-4.8 rjson_0.2.15 AnnotationForge_1.18.2
[70] bitops_1.0-6 base64enc_0.1-3 gtable_0.2.0
[73] DBI_0.7 R6_2.2.2 GenomicAlignments_1.12.2
[76] dplyr_0.7.3 rtracklayer_1.36.4 bit_1.1-12
[79] bindr_0.1 KernSmooth_2.23-15 stringi_1.1.5
[82] BatchJobs_1.6 Rcpp_0.12.12
Thanks so much Rory! Please find the output below. Everything seems okay though. I am really stuck at this step and your help is very much appreciated.
I can share the data etc. with you if you wish to have a look. Thanks again for your help!
> obj <- dba(sampleSheet='liver_H3K27Ac_samples.csv')
> obj$class[10:11,]
S1
bamRead "/Volumes/ChIP-seq/Bed_resample/S1_H3K27ac.bed"
bamControl "/Volumes/ChIP-seq/Bed_resample/S1_H3.bed"
S2
bamRead "/Volumes/ChIP-seq/Bed_resample/S2_H3K27ac.bed"
bamControl "/Volumes/ChIP-seq/Bed_resample/S2_H3.bed"
S3
bamRead "/Volumes/ChIP-seq/Bed_resample/S3_H3K27ac.bed"
bamControl "/Volumes/ChIP-seq/Bed_resample/S3_H3.bed"
S4
bamRead "/Volumes/ChIP-seq/Bed_resample/S4_H3K27ac.bed"
bamControl "/Volumes/ChIP-seq/Bed_resample/S4_H3.bed"
S5
bamRead "/Volumes/ChIP-seq/Bed_resample/S5_H3K27ac.bed"
bamControl "/Volumes/ChIP-seq/Bed_resample/S5_H3.bed"
S6
bamRead "/Volumes/ChIP-seq/Bed_resample/S6_H3K27ac.bed"
bamControl "/Volumes/ChIP-seq/Bed_resample/S6_H3.bed"
> file.exists(unique(obj$class[10,]))
[1] TRUE TRUE TRUE TRUE TRUE TRUE
> file.exists(unique(obj$class[11,]))
[1] TRUE TRUE TRUE TRUE TRUE TRUE
The next step is to check the bed files to make sure they are well-formed. You are using 12 files, and there is some issue with 9 of them. I expect the warnings to state the name of the file which makes it easy to see which file shave a problem, but in this case we are seeing completely blank warning messages (!).
We could start with seeing the
head
(first six lines or so) of each bed file to make sure it is well-formed. If the issue is somewhere deep in the bed files, we'll need the whole file. If you could give me access to at least four of your bed files we can be sure at least one of them has an issue. Could you put these up somewhere I can access them, as well as the DBA objectobj
.It is rare these days to see sequencing alignments in a bed file, indeed we have been considering removing support for bed alignments in an upcoming release.
-Rory