Clarification about the functionality of dba.count and dba.analyze
JKUMAR12 • 0
@jkumar12-8658
Last seen 8.7 years ago
United States

I am trying to get some clarity on dba.count and dba.analyze. 

1) For dba.count: does the function take the overlapping peaks and, instead of using the score assigned by the peak caller (which I understand dba() does), use the number of reads found at each peak in a sample to calculate how well that correlates with the same peak in another sample? Or does it just look at reads across the entire genome, regardless of where peaks are called?

2) When you set up the contrasts and then run dba.analyze, are the peaks of the samples in each contrast group pooled together for the differential analysis?

Thank you!
Jaya

diffbind • 1.2k views
Rory Stark ★ 5.1k
@rory-stark-5741
Last seen 12 days ago
Cambridge, UK

Hello Jaya-

1. dba.count() counts reads in every consensus peak for every sample. With the default overlap method of building the consensus peakset, the regions it looks at are those identified as peaks in at least two samples, but it counts the reads in those regions for every sample, whether or not a peak was called there for that sample.
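To make the counting behaviour concrete, here is a minimal Python sketch of the idea, not DiffBind's actual implementation. It assumes a single chromosome, closed intervals, and reads reduced to single positions; the function names `consensus_peaks` and `count_reads` and the toy data are hypothetical.

```python
def consensus_peaks(peaks_by_sample, min_overlap=2):
    """Merge overlapping peaks across samples and keep the merged regions
    that are supported by peaks from at least `min_overlap` samples."""
    tagged = sorted(
        (start, end, sample)
        for sample, peaks in peaks_by_sample.items()
        for start, end in peaks
    )
    merged = []  # each entry: [start, end, set of contributing samples]
    for start, end, sample in tagged:
        if merged and start <= merged[-1][1]:
            # Overlaps the current merged region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2].add(sample)
        else:
            merged.append([start, end, {sample}])
    return [(s, e) for s, e, samples in merged if len(samples) >= min_overlap]


def count_reads(reads_by_sample, consensus):
    """Count reads in every consensus region for every sample,
    regardless of whether that sample had a peak called there."""
    return {
        sample: [sum(s <= pos <= e for pos in reads) for s, e in consensus]
        for sample, reads in reads_by_sample.items()
    }


# Toy data (hypothetical): sample C has no peak near 100-250.
peaks = {"A": [(100, 200), (500, 600)], "B": [(150, 250)], "C": [(900, 950)]}
reads = {"A": [120, 180, 550], "B": [160, 240], "C": [110, 910]}

consensus = consensus_peaks(peaks)
counts = count_reads(reads, consensus)
print(consensus)
print(counts)
```

In this toy case only the region shared by A and B survives as a consensus peak, yet sample C still gets a read count for it, which is the behaviour described above.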

2. When dba.analyze() is invoked using either edgeR or DESeq/DESeq2, the replicate samples in each group aren't really pooled. Instead, they are used to determine how well the samples within each group agree. Groups whose replicate samples have lower variance will yield better confidence scores (lower p-values and FDR).
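The effect of within-group variance can be illustrated with a small Python sketch. This is not the model edgeR or DESeq2 fits (those use negative binomial models); it swaps in a plain Welch t statistic purely to show that, for the same difference in group means, tighter replicates give a stronger test statistic. The helper `welch_t` and the read counts are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch t statistic: difference in means divided by its standard error,
    where the standard error comes from each group's own replicate variance."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical read counts at one consensus peak, three replicates per group.
tight_a, tight_b = [100, 102, 98], [50, 51, 49]   # replicates agree closely
noisy_a, noisy_b = [100, 140, 60], [50, 80, 20]   # same means, noisy replicates

t_tight = welch_t(tight_a, tight_b)
t_noisy = welch_t(noisy_a, noisy_b)
print(t_tight, t_noisy)
```

Both comparisons have the same difference in means, but the replicates are kept separate rather than summed, so the noisy groups produce a much smaller statistic and would earn a weaker confidence score.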

Hope this helps-

Rory


Thanks Rory! This was helpful. 
