Question: DiffBind question: How to extract read counts?
20 months ago by
k.panov0
k.panov0 wrote:

Hi Rory,

I wonder whether it is possible to extract the count data (after running dba.count) from the resulting DBA object in CSV format, with consensus peak names in the first column, peak positions in the second, third, and fourth columns (chromosome, start, end), and the normalized number of reads for each sample in the subsequent columns.

Regards

Konstantin

20 months ago by
Rory Stark3.0k
CRUK, Cambridge, UK
Rory Stark3.0k wrote:

I'm not sure what you mean by consensus peak names. The peak intervals are specified only by their positions; the closest they have to a name is a number (1:numpeaks).

Here's one way of getting the columns you want in a csv (with the peak number as the peak name):

> normCounts <- dba.peakset(myDBA, bRetrieve=TRUE, DataType=DBA_DATA_FRAME)
> write.csv(normCounts, file="normalized_counts.csv")
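
If explicit labels such as Peak1, Peak2 are wanted in the first column rather than bare row numbers, they can be added to the data frame before writing it out. The following is a minimal sketch in base R, not something DiffBind provides: the small normCounts data frame is made-up stand-in data, and its column names (CHR, START, END, plus hypothetical sample names) are assumptions about what the retrieved data frame looks like.

```r
# Stand-in for the data frame returned by dba.peakset(); all values are invented
normCounts <- data.frame(
  CHR = c("chr1", "chr1", "chr2"),
  START = c(100, 5000, 250),
  END = c(600, 5400, 900),
  sampleA = c(12.5, 3.1, 40.2),
  sampleB = c(10.0, 4.7, 38.9),
  stringsAsFactors = FALSE
)

# Prepend a name column built from the peak number (1:numpeaks)
named <- data.frame(
  PeakName = paste0("Peak", seq_len(nrow(normCounts))),
  normCounts,
  stringsAsFactors = FALSE
)

# row.names = FALSE keeps the peak name as the first column in the file
outFile <- tempfile(fileext = ".csv")
write.csv(named, file = outFile, row.names = FALSE)
```

With a real DBA object, normCounts would come from the dba.peakset() call above instead of being typed in by hand.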

Thank you very much Rory, that is what I need. Regarding peak names, I just was not sure whether any peak names (e.g. Peak 1, Peak 2, and so on) were given in addition to the coordinates.

Regards

Konstantin

Hi Rory,

Just to clarify: the counts extracted by the dba.peakset function aren't normalized to the library size, are they?

Regards

Konstantin

The values returned are controlled by the score parameter passed to dba.count(). The default values are TMM-normalized, which takes library size into account. (You can tell they're normalized because they are not integers.)

You can change the score without having to re-count the reads by calling dba.count() with peaks=NULL and score=DBA_SCORE_READS (or any of the other scores detailed on the dba.count() man page).
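
Put together, the re-scoring and export described in this thread might look like the sketch below. The wrapper name exportRawCounts and its arguments are hypothetical; the DiffBind calls and constants are the ones quoted above, and the sketch assumes a DBA object on which dba.count() has already been run. Defaults and available scores may differ between DiffBind versions, so check the dba.count() man page for the installed release.

```r
# Hypothetical helper: switch the score to raw read counts, then write a CSV
exportRawCounts <- function(dbaObj, file) {
  # peaks = NULL changes the score on the existing counts without re-counting
  rescored <- DiffBind::dba.count(dbaObj, peaks = NULL,
                                  score = DiffBind::DBA_SCORE_READS)
  # Retrieve peak coordinates and per-sample scores as a data frame
  counts <- DiffBind::dba.peakset(rescored, bRetrieve = TRUE,
                                  DataType = DiffBind::DBA_DATA_FRAME)
  # Row names (the peak numbers) end up in the first CSV column
  write.csv(counts, file = file)
  invisible(counts)
}
```

It would be called as exportRawCounts(myDBA, "raw_counts.csv"); the sample columns should then hold integer read counts rather than normalized values.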