Having trouble to get the ChIPQC to work properly. My ultimate goal is to have the html report generated using ChIPQCreport().
I read in the bam files through dba function in the DiffBind package. All went well and I was able to run all DiffBind functions.
Using the same sample sheet and dba object does not work with ChIPQC.
I tried read in sample sheet directly or the dba object, both failed.
I have 8 ChIP-seq samples and two input files as controls. All bam files are sorted and indexed. Both .bam and .bai files are in the same folder. All chromosome number are the same in bam and bed files, only digits and no "chr" before the digits. T
here are some discrepancy between the chromosomes, in .bam files, there are more chromosomes but in the .bed file, there were only "chromosome 1, 3, 5, 6, 7, 9, 10, 11, 13, 16, 17, 18, X". Although I don't think this should be an issue since the peak file (summit file) (.bed) are called peaks from MACS, which may not have all chromosomes as the .bam files do.
Here's the error message I got when using dba object to run ChIPQC (using only chromosome 18 as example)
Using the dba object:
dataset <- dba(sampleSheet="/path_to_peakset/peaksets_20180910.csv") dataset_QC = ChIPQC(dataset, annotation="mm9", chromosomes = "18")
These are the error message appeared:
Bam file has 211 contigs Error in value[[3L]](cond) : failed to open BamFile: failed to load BAM index file: /Users/yiko/Dropbox/0_Rader_Lab/Lab_Members/Xin_Bi/diffbind/BAM/Input_KOAligned.sortedByCoord.out.bam In addition: Warning messages: 1: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 2: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 3: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 4: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 5: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 6: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 7: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 8: In doTryCatch(return(expr), name, parentenv, handler) : [bam_index_load] fail to load BAM index.
Using the ChIPQC function directly:
dataset2 = ChIPQC(experiment = "/path_to_samplesheet/peaksets_20180910.csv", annotation="mm9", bCount = TRUE, chromosomes = "18")
Error message from ChIPQC function:
Bam file has 211 contigs Error in value[[3L]](cond) : failed to open BamFile: failed to load BAM index file: /Users/yiko/Dropbox/0_Rader_Lab/Lab_Members/Xin_Bi/diffbind/BAM/Input_KOAligned.sortedByCoord.out.bam In addition: Warning messages: 1: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 2: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 3: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 4: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 5: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 6: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 7: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 8: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 9: In if (is.na(peaks)) peaks = NULL : the condition has length > 1 and only the first element will be used 10: In doTryCatch(return(expr), name, parentenv, handler) : [bam_index_load] fail to load BAM index.
sessionInfo()
R version 3.4.1 (2017-06-30) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS High Sierra 10.13.6 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base other attached packages: [1] BiocParallel_1.10.1 ChIPQC_1.12.3 ggplot2_3.0.0 XML_3.98-1.16 bindrcpp_0.2.2 DiffBind_2.4.8 SummarizedExperiment_1.6.5 [8] DelayedArray_0.2.7 matrixStats_0.54.0 Biobase_2.36.2 GenomicRanges_1.28.6 GenomeInfoDb_1.12.3 IRanges_2.10.5 S4Vectors_0.14.7 [15] BiocGenerics_0.22.1 BiocInstaller_1.26.1
(don't have enough space to paste all info - plz let me know if you need the complete session info, thanks)
Does that file have an index file? It appears that it doesn't. Also, you should first update to the current version of R/Bioc, as we don't support outdated versions (your installation is well over a year out of date).
I do have the index files (.bai) under the same folder. I have R 3.5 installed on a server that I’ll test the same function again this afternoon. Thanks for the suggestion @James
I can't get the ChIPQC installed on the server R 3.5 because anaconda conflicts. There are many dependencies couldn't be installed. Do you think the errors in this post are due to the older R version 3.4.1?
I think you should simplify things and investigate the problem -- unable to
I think you trigger this with
and then it would be helpful to look at the files at that location, whether the index file is there, the specific name of the index file, and whether the name of the index file is consistent with the expectation in the documentation -- I *think* in your version if the base file name was foo.bam, then the index had to be foo.bam.bai
I don't know. Packages may or may not get updated at each release cycle, and that becomes even more complicated when a given package depends on functions from other packages which may or may not have been updated as well.
However, Rsamtools has a function called open.BamFile:
That tries to open a bam file that you have specified, which in turn calls some underlying C code that tries to open the bam index file, and which returns the error 'failed to open BAM index'.
So if you get an error that says
That's a really good indication that there isn't an index file for that particular bam file.
You might try doing
Directly. Dollars to donuts you get the same error, which again is a really good indication that you either don't have an index file for that bam file, or if you do, it's borked.