Question

consensus peak counts

mm2489 ▴ 20

@mm2489-7705

Last seen 7.3 years ago

United States

Hi

I have a question about DiffBind counts. I read in the vignette that the binding affinity matrix contains a normalized read count for each sample. How exactly is this normalization done, and can I directly compare counts from one sample to another?

Thanks

diffbind • 1.1k views

updated 9.6 years ago by Rory Stark ★ 5.2k • written 9.6 years ago by mm2489 ▴ 20

score 0 · Answer 1 · 2015-05-06

Hi-

The normalization is controlled by the score parameter to dba.count(). The man page lists all the available scoring schemes. The default, DBA_SCORE_TMM_MINUS_FULL, uses the TMM method from the edgeR package to normalize read counts, with control reads subtracted ("MINUS") and library sizes taken as the total number of reads in the .bam files ("FULL").
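To give a feel for what a TMM score with control subtraction and full library sizes involves, here is a rough sketch that calls edgeR directly on made-up counts. It is not DiffBind's internal code: the toy matrices, the library sizes, the floor of 1 after subtraction, and the final cpm() step are all assumptions made for illustration.

```r
library(edgeR)

set.seed(1)
n_peaks <- 200

# Hypothetical ChIP and control read counts: consensus peaks in rows, samples in columns.
chip <- matrix(rpois(n_peaks * 4, lambda = 50), nrow = n_peaks,
               dimnames = list(NULL, paste0("sample", 1:4)))
control <- matrix(rpois(n_peaks * 4, lambda = 5), nrow = n_peaks)

# "MINUS": subtract control reads. Flooring at 1 is an assumption of this sketch.
counts <- pmax(chip - control, 1)

# "FULL": library size is the total number of reads in each .bam file (invented values).
full_lib_size <- c(2.0e7, 2.5e7, 1.8e7, 3.0e7)

# Compute TMM normalization factors, then counts per million on the effective library sizes.
y <- DGEList(counts = counts, lib.size = full_lib_size)
y <- calcNormFactors(y, method = "TMM")
norm_counts <- cpm(y, normalized.lib.sizes = TRUE)

head(norm_counts)
```

The point of the sketch is only that the values depend on which library size is used and on the per-sample TMM factor, so they are on a different scale from the raw counts.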

You can easily change the normalization score without having to re-count reads by calling dba.count() with peaks=NULL. For example,

> DBA <- dba.count(DBA, peaks=NULL, score=DBA_SCORE_READS)

will change it to use raw read counts, while

> DBA <- dba.count(DBA, peaks=NULL, score=DBA_SCORE_RPKM)

will change it to use RPKM values, which are scaled by both library size and peak width and so would be comparable between experiments.
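For reference, RPKM is just reads per kilobase of peak width per million reads in the library. A small base R sketch with invented numbers (the peak widths, library sizes, and variable names are hypothetical, and this is not how DiffBind stores its data):

```r
# Toy raw read counts: 3 consensus peaks (rows) in 2 samples (columns).
counts <- matrix(c(100, 200, 50,     # sampleA
                   300, 600, 150),   # sampleB, sequenced 3x deeper
                 nrow = 3,
                 dimnames = list(c("peak1", "peak2", "peak3"),
                                 c("sampleA", "sampleB")))

peak_width <- c(500, 1000, 250)               # peak widths in bp
lib_size   <- c(sampleA = 1e6, sampleB = 3e6) # total reads per library

# Reads per kilobase of peak: each row is divided by its peak width in kb.
per_kb <- counts / (peak_width / 1e3)

# Per million reads: each column is divided by its library size in millions.
rpkm <- sweep(per_kb, 2, lib_size / 1e6, "/")

rpkm
```

In this toy case the raw counts differ threefold between the two samples purely because of sequencing depth, while the RPKM values come out identical.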

Note that when you actually run a differential analysis, a normalization scheme specific to the analysis method (edgeR or DESeq2) will be used for that analysis without changing the scores in the global affinity matrix.

Hope this helps-

Rory