Question: dba.count binding matrix coordinates
jingyaq wrote (13 days ago):

Hi,

Thanks for this great package. I just have a question about the output of dba.count.

I'm trying to generate a count matrix with dba.count, but noticed a strange mixup with the coordinates. Attached below is a toy example that I hope clearly shows the issue.

consensus_peakset is a GRanges object of a peak set I retrieved using dba.peakset. I ran dba.count on a few peaks from 2 chromosomes (chr1, chr15). The resulting db$binding matrix has the same start and end coordinates, but the CHR column shows chromosomes 1 and 2. One of these must be incorrect.

I noticed the chromosome name format changed from 'chrX' to just 'X', and I imagine that has something to do with the reordering of factor levels. Am I doing something wrong, or is this a bug?

Thanks again for your help!

Rory Stark (CRUK, Cambridge, UK) wrote (12 days ago):

The issue is how you are retrieving the binding matrix. Accessing it directly (db$binding) returns an internal data frame. The CHR column in that data frame is an index into a separate vector of chromosome names (db$chrmap), not a chromosome "number". So the index of the chromosome labelled "chr2" is not necessarily the number 2! In your example, chr15 is presumably the second entry in db$chrmap, which is why it shows up as 2.
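To illustrate the index lookup described above, here is a minimal base-R sketch. The `chrmap` vector and `binding` data frame are toy stand-ins with made-up coordinates, not real DiffBind output, and the `chr_name` column is a name introduced only for this illustration.

```r
# Toy stand-ins for db$chrmap and db$binding (hypothetical values).
chrmap <- c("chr1", "chr15")

binding <- data.frame(
  CHR   = c(1, 1, 2),            # positions in chrmap, not chromosome numbers
  START = c(10000, 52000, 73000),
  END   = c(10500, 52600, 73400)
)

# Translate each index back to its chromosome name by subsetting chrmap.
binding$chr_name <- chrmap[binding$CHR]

print(binding)
```

Here the rows with CHR equal to 2 map to "chr15", because "chr15" is the second element of the toy `chrmap`.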

The officially supported way of retrieving the binding matrix is to use dba.peakset() with bRetrieve=TRUE. This will do the chromosome name translation and the matrix will be returned as a GRanges object (or as a dataframe if you set DataType=DBA_DATA_FRAME).
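As a rough sketch of the supported route, the calls might look like the following. It assumes DiffBind is installed and that `db` is the DBA object returned by dba.count(); the wrapper name `retrieve_binding` and its `as_data_frame` argument are hypothetical and exist only for this example.

```r
# Hypothetical helper wrapping the supported retrieval call.
# `db` is assumed to be a DBA object produced by dba.count().
retrieve_binding <- function(db, as_data_frame = FALSE) {
  if (as_data_frame) {
    # Binding matrix as a data frame, with chromosome names translated
    DiffBind::dba.peakset(db, bRetrieve = TRUE,
                          DataType = DiffBind::DBA_DATA_FRAME)
  } else {
    # Default described above: binding matrix returned as a GRanges object
    DiffBind::dba.peakset(db, bRetrieve = TRUE)
  }
}

# Usage, once `db` exists:
# peaks_gr <- retrieve_binding(db)
# peaks_df <- retrieve_binding(db, as_data_frame = TRUE)
```

Either form should report chromosome names rather than the internal indices seen in db$binding.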
