Question: DiffBind - Error in dba.count() when passing mask or peaks
0
gravatar for gaia.zaffaroni
3.8 years ago by
gaia.zaffaroni0 wrote:

Hello,

I have a problem with the dba.count() function.

I have two conditions with 3 replicates each, and when I do

sample = dba(sampleSheet=sampleSheet, peakFormat='bed')
sample=dba.count(sample)

or even

sample=dba.count(sample,minOverlap=6)

I have no problem. Anyway, when I try to find consensus peaksets for the two condition separately and then count with their overlap, dba.count() is not working.

prova=dba.peakset(prova,consensus=DBA_CONDITION,minOverlap=2)
peaks=dba.peakset(prova,prova$masks$Consensus,bRetrieve=T)

prova = dba.count(prova,peaks=peaks) ###NOT WORKING
prova = dba.count(prova,peaks=prova$masks$Consensus)    #NOT WORKING

Error in if (is.unsorted(unique(pv$vectors[, 1]))) { :
  missing value where TRUE/FALSE needed

traceback()
5: pv.vectors(model, mask = mask, minOverlap = minOverlap, bKeepAll = bKeepAll,
       bAnalysis = bAnalysis, attributes = attributes)
4: pv.model(spare)
3: pv.CalledMasks(pv, res, bed)
2: pv.counts(DBA, peaks = peaks, minOverlap = minOverlap, defaultScore = score,
       bLog = bLog, insertLength = fragmentSize, bOnlyCounts = T,
       bCalledMasks = TRUE, minMaxval = filter, bParallel = bParallel,
       bUseLast = bUseLast, bWithoutDupes = bRemoveDuplicates, bScaleControl = bScaleControl,
       filterFun = filterFun, bLowMem = bUseSummarizeOverlaps, readFormat = readFormat,
       summits = summits, minMappingQuality = mapQCth)

What is wrong?

Thanks in advance,

Gaia

diffbind • 2.2k views
ADD COMMENTlink modified 3.7 years ago by kmavrommatis0 • written 3.8 years ago by gaia.zaffaroni0
Answer: DiffBind - Error in dba.count() when passing mask or peaks
0
gravatar for Rory Stark
3.8 years ago by
Rory Stark3.0k
CRUK, Cambridge, UK
Rory Stark3.0k wrote:

Hi Gaia-

Three things that may be useful to help in tracking this down:

  • Can you let me know which version of DiffBind you are working with (via sessionInfo())?
  • Run the dba.count() call with bParallel=FALSE; in serial mode it prints out each file as it counts so we can identify which file is the problem.
  • If you could sen me a copy of the "prova" DBA object (or link where I can download) , I can have a look to see if something is going wrong with the consensus peaks.

Cheers-

Rory

ADD COMMENTlink written 3.8 years ago by Rory Stark3.0k

Hi Rory, thanks for your help.

I am using DiffBind_1.16.3. This is what I get without parallel:

prova = dba.count(prova,peaks=prova$masks$Consensus,bParallel=FALSE)
Sample: CHN080-alignedreads.bam125
Sample: CHN081-alignedreads.bam125
Sample: CHN082-alignedreads.bam125
Sample: CHN083-alignedreads.bam125
Sample: CHN084-alignedreads.bam125
Sample: CHN085-alignedreads.bam125
Error in if (is.unsorted(unique(pv$vectors[, 1]))) { :
  missing value where TRUE/FALSE needed

You can find the "prova" object here:https://drive.google.com/file/d/0B5haNz0A0-UVOW11TnlzcGhGTDA/view?usp=sharing

ADD REPLYlink written 3.8 years ago by gaia.zaffaroni0
Answer: DiffBind - Error in dba.count() when passing mask or peaks
0
gravatar for Rory Stark
3.8 years ago by
Rory Stark3.0k
CRUK, Cambridge, UK
Rory Stark3.0k wrote:

Hi Gaia-

If I understand correctly, you want to use a consensus peaksets made up of all the peaks that appear in two out of three replicates of each condition, right?

The trick here is to compute the consensus peaksets in a separate DBA object than the "main" one:

> prova.cons <- dba.peakset(prova,consensus=DBA_CONDITION,minOverlap=2)
> peaks <- dba.peakset(prova.cons,prova.cons$masks$Consensus,bRetrieve=T)

Then use the main DBA object to continue the counting and analysis:

> prova <- dba.count(prova,peaks=peaks) ##SHOULD WORK

If you want a consensus set that includes peaks identified in at least two replicates in either condition (the more usual consensus for this sort of analysis, as peaks that are identified consistently in one condition but not the either are likely to be of interest in a differential binding analysis), you would do:

> peaks <- dba.peakset(prova.cons,prova.cons$masks$Consensus,minOverlap=1,bRetrieve=T)
> prova <- dba.count(prova,peaks=peaks)

Hope this helps!

Cheers-

Rory

P.S. I've added a more clear error message in the development version to replace the obscure error if someone tries to run dba.count() using a DBA object that contains Consensus peaksets(s).

ADD COMMENTlink modified 3.8 years ago • written 3.8 years ago by Rory Stark3.0k

Hi Rory,

yes that was my idea. Thanks for pointing out that it is not the usual approach, I will go for the usual consensus then.

For sake of completeness, I went on trying your suggestion, but the same error comes up, even if I use a different DBA object to compute the consensus peaks.

Cheers,

Gaia

 

ADD REPLYlink written 3.8 years ago by gaia.zaffaroni0

Let me think on this a bit!

ADD REPLYlink written 3.8 years ago by Rory Stark3.0k
Answer: DiffBind - Error in dba.count() when passing mask or peaks
0
gravatar for kmavrommatis
3.7 years ago by
kmavrommatis0 wrote:

Hi,

I have the same problem,

I load the dba object with:

dbaObj=dba(sampleSheet = df.dba.ss,bCorPlot=FALSE, minOverlap = 1, attributes=c(DBA_CONDITION,DBA_ID,DBA_CONTROL))

then calculate the consensus peaks

dbaObj=dba.peakset( dbaObj, consensus=DBA_CONDITION,minOverlap = 3)

peaksworth=dba.peakset(  dba( dbaObj,mask=dbaObj3$masks$Consensus),  bRetrieve=TRUE)

dbaObj.reads=dba.count( dbaObj, peaks=unique(peaksworth),summits = TRUE)

and I get:

Error in if (is.unsorted(unique(pv$vectors[, 1]))) { : 
  missing value where TRUE/FALSE needed 
5 pv.vectors(model, mask = mask, minOverlap = minOverlap, bKeepAll = bKeepAll, 
    bAnalysis = bAnalysis, attributes = attributes) 
4 pv.model(spare) 
3 pv.CalledMasks(pv, res, bed) 
2 pv.counts(DBA, peaks = peaks, minOverlap = minOverlap, defaultScore = score, 
    bLog = bLog, insertLength = fragmentSize, bOnlyCounts = T, 
    bCalledMasks = TRUE, minMaxval = filter, bParallel = bParallel, 
    bUseLast = bUseLast, bWithoutDupes = bRemoveDuplicates, bScaleControl = bScaleControl,  ... 
1 dba.count(dbaObj, peaks = unique(peaksworth), 
    summits = TRUE) 

 

 

Any advice welcome.

Thanks

Kostas

ADD COMMENTlink written 3.7 years ago by kmavrommatis0

I am answering my own question for future reference.

As it turns out the problem was the presence of non standard chromosomes in the peaksets.

After I processed the input files and kept only chr1.. chr22, chrX, chrY the program worked without problems.

R version 3.2.2 (2015-08-14)

DiffBind_1.16.3   

 

ADD REPLYlink written 3.6 years ago by kmavrommatis0

It may be worth trying this again with DiffBind 2.0, which should be released in the next few days as part of Bioconductor 3.3. The code for dealing with chromosome names has been completely re-written and will (hopefully) deal with non-standard cases now.

-R

ADD REPLYlink written 3.6 years ago by Rory Stark3.0k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 165 users visited in the last hour