It would be useful to see what is actually in the features object. My guess is that the table of enhancer regions is not in a valid format. Optimally this is a GRanges object, and the chromosome names must match those in the reference genome used for alignment.
FYI, there is no need to set minOverlap=1 in this call, as this argument is ignored when you supply a consensus dataset using the peaks parameter.
Finally, I just identified the problem: in the table of enhancer regions, some features has a length of 0 (!), with chromStart = chromEnd. After removing these features, dba.count runs without error.
Regarding minOverlap=1, I have to set this argument during my tests where I kept only one sample in ATAC_sampleSheet.csv (to decrease computation time), otherwise I had an error message minOverlap can not be greater than the number of peaksets [1].
But how retrieve summit value for each peak? I read:
summit heights (read pileup) and locations will be calculated for each peak. The values can retrieved using dba.peakset
When i do peaks.list <- dba.peakset(DBA = exp.dba, bRetrieve = T, DataType = "DBA_DATA_FRAME")
it seems that retrieved values are more of a coverage (total reads number including in each feature) than the summit of the peak?
I'm checking in a fix for the minOverlap error message, thanks for pointing it out!
You need to set score = DBA_SCORE_SUMMIT_POS to have the scores be the summit position. You can either do this in the original call to dba.count(), or call it a second time (by setting peaks=NULL, it will not re-count):
I'm checking in a fix for the
minOverlap
error message, thanks for pointing it out!You need to set
score = DBA_SCORE_SUMMIT_POS
to have the scores be the summit position. You can either do this in the original call todba.count()
, or call it a second time (by settingpeaks=NULL
, it will not re-count):ok thank you for this precision