Question

Difbind: output rows from binding heatmap plot

0

Entering edit mode

svohra • 0

@svohra-11397

Last seen 9.4 years ago

Cambridge

Hi,

I have generated a heatmap based on binding sites using DiffBind package. I was wondering if it is possible to output the rows (dendogram) in the order as they appear on the heatmap.

Many Thanks,

diffbind • 2.0k views

ADD COMMENT • link updated 9.4 years ago by Rory Stark ★ 5.2k • written 9.4 years ago by svohra • 0

score 1 · Answer 1 · 2016-09-01

1

Entering edit mode

Rory Stark ★ 5.2k

@rory-stark-5741

Last seen 13 months ago

Cambridge, UK

The binding sites are returned invisibly in a GRanges object, in the order they appear in the plot. So all you have to do is to assign the return value of the dba.plotHeatmap() [or plot()] to a variable. For example:

> data(tamoxifen_counts)
> bsites <- dba.plotHeatmap(tamoxifen, correlations=FALSE)
> bsites

Cheers-

Rory

ADD COMMENT • link 9.4 years ago Rory Stark ★ 5.2k

0

Entering edit mode

Hi,

I have tried this option but I get a value of null. I use the following commands
>peaks=dba(sampleSheet="samplesheet.csv")
>rawcounts=dba.count(peaks)
>x= dba.plotHeatmap(rawcounts,correlations=FALSE)
> x
NULL

Any suggestions?

Thanks,

ADD REPLY • link 9.4 years ago svohra • 0

1

Entering edit mode

What version of Bioconductor/DiffBind are you using? The feature to return the binding sites in order was introduced in DiffBind v1.16, which was part of Bioconductor 3.2. The current version is Bioconductor 3.3/DiffBind 2.0.x.

ADD REPLY • link 9.4 years ago Rory Stark ★ 5.2k

0

Entering edit mode

I upgraded Diffbind ( DiffBind_2.0.6) but I am not getting some error while loading the samplesheet. > peakSet=dba(sampleSheet="samplesheet.csv")
Error in if (sum(changed) > 0) { : missing value where TRUE/FALSE needed

ADD REPLY • link 9.4 years ago svohra • 0

0

Entering edit mode

Hmmn, that error is in a function that checks the strings in the sample sheet. If you could send me a copy of your samplesheet, I can have a look.

rory.stark@cruk.cam.ac.uk

P.S. If you had an existing DBA object from a previous version, and save it with dba.save(), you should be able to read it into the new version using dba.load(), which converts between versions.

ADD REPLY • link 9.4 years ago Rory Stark ★ 5.2k

0

Entering edit mode

Looks like some new error-checking code introduced a bug when dealing with NAs in the samplesheet. I'll check in a fix, but in the meantime if you delete the columns that have NAs it should work.

-R

ADD REPLY • link 9.4 years ago Rory Stark ★ 5.2k

0

Entering edit mode

I am having another issue, not entirely sure if it is a bug. I am trying
to count the reads on a peakset generated by another program. It use to
work fine with the previous version of DiffBind but now I am getting an
error
peakSet = dba(sampleSheet="samplesheet.txt")
new= dba.peakset(peakSet, peaks="promoter.peaks.bed") 
rawcount= dba.count(peakSet, peaks=new$masks[[6]])

Error in pv$peaks[[i]] : subscript out of bounds

ADD REPLY • link 9.4 years ago svohra • 0

0

Entering edit mode

I'm not exactly what you are trying to do here, but I see a couple of problems with this script.

1. The second line adds a peakset (contained in the file promoter.peaks.bed) to an existing DBA object (peakSet) which when returned is assigned to a new variable (new). In the third line, you are counting using the original DBA object (peakSet), but using a mask from the "new" DBA object. As the new DBA object has an additional sample peakset in it, the masks can not be used with the original peakset (mask will be the wrong length), hence the error you are seeing.

2. It doesn't make sense to use a mask as the value for the peaks parameter to dba.count() -- this has to be an actual peakset. A mask is used to limit which samples are used ("masking out" some of the samples). A peakset specifies which binding site intervals to include.

What I think you want to do is to have the peaks in promoter.peaks.bed serve as the peakset to use when counting the reads in the samples in the original peakSet DBA object. There are number of ways to do this, but the most straightforward way is:

> peakSet <- dba(sampleSheet="samplesheet.txt")
> rawcount <- dba.count(peakSet, peaks="promoter.peaks.bed")

However there may be a bug in this (I am checking in a fix that should appear as DiffBind 2.0.7). A workaround is to read the peaks in first, giving:

> peakSet <- dba(sampleSheet="samplesheet.txt")
> promoterPeaks <- read.table("promoter.peaks.bed")
> rawcount <- dba.count(peakSet, peaks=promoterPeaks)

-Rory

ADD REPLY • link 9.4 years ago Rory Stark ★ 5.2k