DiffBind Error when dba.count on a pre-defined peak set
1
0
Entering edit mode
tangming2005 ▴ 200
@tangming2005-6754
Last seen 9 weeks ago
United States

Hi,

I have run Diffbind on a set of ChIP-seq samples and generated a dba.count object. Now, I want to count reads on a set of peaks (promoters/enhancers) that I pre-defined and got this error:

H3K27ac.dba<- dba(sampleSheet="my_full_diffbind.csv", scoreCol= 7, filter=80, peakFormat = "macs")

## using RPKM for normalization (reads number in the bam files)
H3K27ac_RPKM<- dba.count(H3K27ac.dba, minOverlap=2, 

                      fragmentSize = 200, bParallel = T,
                      score = DBA_SCORE_RPKM)

mypeaks<- import("my_enhancers.bed", format="BED")

mypeaks.count <- dba.count(H3K27ac_RPKM, peaks= mypeaks, bParallel = T,

+                                      fragmentSize = 200, score = DBA_SCORE_RPKM, minOverlap = 0)

Error in `$<-.data.frame`(`*tmp*`, "chrmap", value = c("chr1", "chr10",  : 

  replacement has 24 rows, data has 45728

 

Thanks very much.

Ming

diffbind chipseq • 1.4k views
ADD COMMENT
0
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 5 weeks ago
Cambridge, UK

Hi Ming-

There appears to be a bug here. We'll have to wait a short while to fix it in the new release as the current release is closed to new fixes.

I do have a workaround for you however: if you set bUseSummarizeOverlaps=TRUE when calling dba.count(), this should work (although it won't support the fragmentSize=200 parameter).

I am a little curious about what you are trying to do, specifically why you are calling dba.count() two times, and passing in the result of the first call to the second call. If you want to count based on your predefined enhancers, you can pass them into the first call to dba.count(). Also, whenever you pass in pre-defined peaks, the minOverlap parameter is ignored as it only applies when a new set of consensus peaks are being derived.

Apologies for the inconvenience, I'll post here when we are able to check in a fix.

-Rory

ADD COMMENT
0
Entering edit mode

I was trying to reuse the precomputed `dba.count()` to save some time (I saved it as an r object, and later loaded it) .  

The first `dba.count` is for the peaks called by MACS, the second `dba.count` is for the same samples but using pre-defined regions.

I later use dba.count directly with my enhancer regions. only thing I need to do is to modify the samplesheet  .csv file to specify the enhancer peak format as bed, which initially I was lazy to do.

Thanks very much!

Tommy

 

 

ADD REPLY
1
Entering edit mode

There's no real benefit to "precomputing" (counting). When you call dba.count() with a different peakset, it re-does all of the counting from scratch.

ADD REPLY

Login before adding your answer.

Traffic: 531 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6