Hi,
I have run DiffBind on a set of ChIP-seq samples and generated a `dba.count()` object. Now I want to count reads over a set of peaks (promoters/enhancers) that I pre-defined, but the second `dba.count()` call gives this error:
library(DiffBind)
library(rtracklayer)  # provides import() for BED files
H3K27ac.dba <- dba(sampleSheet = "my_full_diffbind.csv", scoreCol = 7, filter = 80, peakFormat = "macs")
## using RPKM for normalization (reads number in the bam files)
H3K27ac_RPKM <- dba.count(H3K27ac.dba, minOverlap = 2,
                          fragmentSize = 200, bParallel = TRUE,
                          score = DBA_SCORE_RPKM)
## second count: same samples, but on my pre-defined enhancer regions
mypeaks <- import("my_enhancers.bed", format = "BED")
mypeaks.count <- dba.count(H3K27ac_RPKM, peaks = mypeaks, bParallel = TRUE,
                           fragmentSize = 200, score = DBA_SCORE_RPKM, minOverlap = 0)
Error in `$<-.data.frame`(`*tmp*`, "chrmap", value = c("chr1", "chr10", :
replacement has 24 rows, data has 45728
Thanks very much.
Ming
I was trying to reuse the precomputed `dba.count()` result to save some time (I saved it as an R object and loaded it later).
The first `dba.count()` call is for the peaks called by MACS; the second is for the same samples but using pre-defined regions.
I later ran `dba.count()` directly on my enhancer regions. The only thing I needed to do was modify the sample sheet .csv file to specify the enhancer peak format as BED, which I was initially too lazy to do.
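A minimal sketch of that sample sheet change, in base R, for anyone with the same problem. The helper name `use_predefined_peaks`, the toy sheet, and the output file name `my_enhancer_diffbind.csv` are made up for illustration. `Peaks` and `PeakFormat` are column names DiffBind sample sheets accept, but check them against your own sheet and DiffBind version. The DiffBind calls are left commented out because they need the real BAM and BED files.

```r
# Hypothetical helper: point every sample at the same pre-defined BED file.
use_predefined_peaks <- function(sheet, bed_path) {
  sheet$Peaks <- bed_path      # same interval file for all samples
  sheet$PeakFormat <- "bed"    # ask for the file to be parsed as BED
  sheet
}

# Toy sheet for illustration only; a real one also has bamReads etc.
toy <- data.frame(SampleID = c("s1", "s2"),
                  Peaks = c("s1_peaks.xls", "s2_peaks.xls"),
                  PeakCaller = "macs",
                  stringsAsFactors = FALSE)
enh <- use_predefined_peaks(toy, "my_enhancers.bed")

# With a real sheet, something along these lines (not run here):
# sheet <- read.csv("my_full_diffbind.csv", stringsAsFactors = FALSE)
# write.csv(use_predefined_peaks(sheet, "my_enhancers.bed"),
#           "my_enhancer_diffbind.csv", row.names = FALSE)
# enh.dba   <- dba(sampleSheet = "my_enhancer_diffbind.csv")
# enh.count <- dba.count(enh.dba, score = DBA_SCORE_RPKM)
```

Because every sample then lists the same intervals, the consensus peakset that gets counted should simply be the enhancer set.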
Thanks very much!
Tommy
There's no real benefit to "precomputing" (counting). When you call `dba.count()` with a different peakset, it re-does all of the counting from scratch.