DiffBind and GRanges error extracting overlapping peaks using dba.overlap
Rory Stark ★ 5.1k
@rory-stark-5741
Last seen 7 days ago
Cambridge, UK
Hi Matt-

Yep, it's a bug! If you plot this as a Venn, you can see one peakset has zero elements:

> dba.plotVenn(chip, 16:19)

There's no check when this is converted to a GRanges object. I'll fix this and check it in soon.

In the meantime, you can work around this by using data frames instead of GRanges, either by:

> chip.OL = dba.overlap(chip, c(16,17,18,19), mode=DBA_OLAP_PEAKS, DataType=DBA_DATA_FRAME)

or by changing the default data type:

> chip$config$DataType = DBA_DATA_FRAME

I tend to run with DBA_DATA_FRAME as the default, which is probably why I didn't spot this bug before.

Cheers-
Rory

On 14/11/2013 16:00, "Matt Zinkgraf" <mzinkgraf at gmail.com> wrote:

>Hi Rory
>Thanks for the response. The chip object can be found at
>https://dl.dropboxusercontent.com/u/96655685/chip.rdata
>
>Matt
>
>-----Original Message-----
>From: Rory Stark [mailto:Rory.Stark at cruk.cam.ac.uk]
>Sent: Thursday, November 14, 2013 4:05 AM
>To: mzinkgraf at gmail.com
>Cc: bioconductor at r-project.org
>Subject: Re: DiffBind and GRanges error extracting overlapping peaks using dba.overlap
>
>Hi Matt-
>
>Your code looks good -- this looks like a bug. It must be something about
>the specific 4-way overlap that you are doing as I can't reproduce it
>with some datasets I have.
>
>Is there a way you can share the DiffBind Object ("chip") with me so I
>can debug it? Dropbox perhaps?
>
>Cheers-
>Rory
>
>On 14/11/2013 01:04, "Matt Zinkgraf [guest]" <guest at bioconductor.org> wrote:
>
>>Hello
>>I am using DiffBind to identify consensus binding sites for multiple
>>transcription factors that have biological replicates. In addition, I
>>want to investigate the overlap of binding sites across the
>>transcription factors. I am able to call consensus peaks for each
>>transcription factor and calculate the overlap rate and plot overlaps
>>with dba.plotVenn but I am getting an error from GRanges when trying
>>to extract the actual overlapping peaks using dba.overlap and
>>DBA_OLAP_PEAKS. Any suggestions on why I am getting this error?
>>
>>Thanks
>>Matt
>>
>>> #load datasets
>>> chip = dba(sampleSheet="chip_datasets_testing.csv", peakCaller="bed")
>>a4142.1 a21 ARK2 1 bed
>>a4142.2 a22 ARK2 2 bed
>>a0304.1 a23 ARK2 1 bed
>>a0304.2 a24 ARK2 2 bed
>>r4748.1 r11 REV  1 bed
>>r8586.1 r21 REV  1 bed
>>r8586.2 r22 REV  2 bed
>>c4344.1 c11 PCN  1 bed
>>c4344.2 c12 PCN  2 bed
>>c4546.1 c21 PCN  1 bed
>>c4546.2 c22 PCN  2 bed
>>a3738.1 a11 ARK1 1 bed
>>a3738.2 a12 ARK1 2 bed
>>a3940.2 a14 ARK1 2 bed
>>a3940.1 a13 ARK1 1 bed
>>>
>>> #create consensus peaks and plot overlap
>>> chip = dba.peakset(chip, consensus = DBA_TREATMENT, minOverlap = 0.5)
>>Add consensus: ARK2
>>Add consensus: REV
>>Add consensus: PCN
>>Add consensus: ARK1
>>>
>>> chip
>>19 Samples, 12123 sites in matrix (29678 total):
>>        ID       Condition Treatment Replicate Peak.caller Intervals
>>1  a4142.1             a21      ARK2         1         bed      3239
>>2  a4142.2             a22      ARK2         2         bed      2026
>>3  a0304.1             a23      ARK2         1         bed      2718
>>4  a0304.2             a24      ARK2         2         bed       581
>>5  r4748.1             r11       REV         1         bed      6958
>>6  r8586.1             r21       REV         1         bed       595
>>7  r8586.2             r22       REV         2         bed       869
>>8  c4344.1             c11       PCN         1         bed      8526
>>9  c4344.2             c12       PCN         2         bed       803
>>10 c4546.1             c21       PCN         1         bed      5524
>>11 c4546.2             c22       PCN         2         bed      5320
>>12 a3738.1             a11      ARK1         1         bed      7443
>>13 a3738.2             a12      ARK1         2         bed      7004
>>14 a3940.2             a14      ARK1         2         bed      5697
>>15 a3940.1             a13      ARK1         1         bed       761
>>16    ARK2 a21-a22-a23-a24      ARK2       1-2         bed      2030
>>17     REV     r11-r21-r22       REV       1-2         bed       548
>>18     PCN c11-c12-c21-c22       PCN       1-2         bed      3500
>>19    ARK1 a11-a12-a14-a13      ARK1       1-2         bed      6210
>>>
>>> dba.overlap(chip, c(16,17,18,19), mode=DBA_OLAP_RATE)
>>[1] 10006 1733 327 11
>>>
>>> chip.OL = dba.overlap(chip, c(16,17,18,19), mode=DBA_OLAP_PEAKS)
>>Error in validObject(.Object) :
>>  invalid class "GRanges" object: NROW(strand(x)) != length(x)
>>
>> -- output of sessionInfo():
>>
>>> sessionInfo()
>>R version 3.0.2 (2013-09-25)
>>Platform: x86_64-w64-mingw32/x64 (64-bit)
>>
>>locale:
>>[1] LC_COLLATE=English_United States.1252
>>[2] LC_CTYPE=English_United States.1252
>>[3] LC_MONETARY=English_United States.1252
>>[4] LC_NUMERIC=C
>>[5] LC_TIME=English_United States.1252
>>
>>attached base packages:
>>[1] parallel stats graphics grDevices utils datasets methods
>>[8] base
>>
>>other attached packages:
>>[1] DiffBind_1.8.2 Biobase_2.22.0 GenomicRanges_1.14.3
>>[4] XVector_0.2.0 IRanges_1.20.5 BiocGenerics_0.8.0
>>
>>loaded via a namespace (and not attached):
>> [1] amap_0.8-7 bitops_1.0-6 caTools_1.16
>> [4] edgeR_3.4.0 gdata_2.13.2 gplots_2.12.1
>> [7] gtools_3.1.1 KernSmooth_2.23-10 limma_3.18.2
>>[10] RColorBrewer_1.0-5 stats4_3.0.2 tools_3.0.2
>>[13] zlibbioc_1.8.0
>>
>>--
>>Sent via the guest posting facility at bioconductor.org.
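For anyone applying the data frame workaround above, it may help to see which overlap sets are empty before converting anything to GRanges. The following is only a sketch in base R: count_intervals is a hypothetical helper (not part of DiffBind), and mock_overlaps is made-up data standing in for the list that dba.overlap returns with DataType=DBA_DATA_FRAME. The element names and coordinates are illustrative assumptions.

```r
# Hypothetical helper (not a DiffBind function): count the intervals held in
# each element of an overlap result so that empty peaksets are easy to spot.
count_intervals <- function(overlaps) {
  vapply(overlaps, NROW, integer(1))
}

# Mock data standing in for the list returned by
#   dba.overlap(chip, 16:19, mode = DBA_OLAP_PEAKS, DataType = DBA_DATA_FRAME)
# Element names and values are illustrative, not taken from the thread.
mock_overlaps <- list(
  onlyA = data.frame(chr   = c("chr1", "chr2"),
                     start = c(100L, 500L),
                     end   = c(200L, 650L)),
  onlyB = data.frame(chr   = character(0),
                     start = integer(0),
                     end   = integer(0))
)

counts <- count_intervals(mock_overlaps)

# Names of the overlap sets that contain no intervals at all.
empty_sets <- names(counts)[counts == 0L]
print(counts)
print(empty_sets)
```

With a real DiffBind object, the same helper could be applied to the result of the dba.overlap call shown in the answer. Any set reported as empty is a candidate for the zero-element peakset described above.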
DiffBind • 1.6k views
@matt-zinkgraf-6248
Last seen 9.6 years ago
Hi Rory

That did the trick.

Thanks
Matt

-----Original Message-----
From: Rory Stark [mailto:Rory.Stark@cruk.cam.ac.uk]
Sent: Thursday, November 14, 2013 10:08 AM
To: Matt Zinkgraf
Cc: bioconductor at r-project.org
Subject: Re: DiffBind and GRanges error extracting overlapping peaks using dba.overlap