Counting Raw Reads with DiffBind
1
0
Entering edit mode
@jessschust-9100
Last seen 9.1 years ago
United States

Hello, 

I am trying to use diffbind on paired end read data. I have done the following to get our proper peaks of interest ( we have technical replicates: 

myDBA=dba(sampleSheet="All_samplesheet.csv", attributes=c(DBA_ID,DBA_CONDITION,DBA_REPLICATE))

myDBA=dba.peakset(myDBA, consensus=-DBA_REPLICATE, minOverlap=3)

I then used Rory's instructions to extract the raw read counts (this is what I need for our next step): 

myDBA <- dba.count(myDBA, peaks=NULL, score=DBA_SCORE_READS)

counts <- dba.peakset(myDBA, bRetrieve=TRUE)

However, when I look at the bam files (using IGV) there appear to be  way more reads than are reported in the count matrix.  How exactly are the reads counted? I have been using your raw read matrix to move forward in my analysis and I am concerned that I am using at the wrong counts.  

Any advice would be greatly appreciated.  

~Jessica 

diffbind • 2.0k views
ADD COMMENT
4
Entering edit mode
Gord Brown ▴ 670
@gord-brown-5664
Last seen 3.9 years ago
United Kingdom

Hi, Jessica,

Sorry for the slow response.  Rory passed along the issue (and your data) to me.  As far as I can tell the answer is that DiffBind by default removes reads that have a mapping quality below 15.  Particularly near centromeres and telomeres, quite a few reads tend to fall into this category.  We usually remove them prior to analysis, since we aren't confident that they're correctly mapped.

If you re-count with:

myDBA <- dba.count(myDBA, score=DBA_SCORE_READS, mapQCth=0, bRemoveDuplicates=FALSE)

you should get everything.  Let me know if you still don't see the numbers you're expecting.  Also, you can try adding the parameter "bUseSummarizedOverlaps=TRUE" which uses a slightly different counting algorithm.

Cheers,

 - Gord

ADD COMMENT

Login before adding your answer.

Traffic: 495 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6