Question

Integrating the computed spike-in coefficients for the normalisation before differential peaks analysis


Konstantin Okonechnikov ▴ 40

@konstantin-okonechnikov-11325


Hi! I have a question about using DiffBind for ChIP-seq data with Drosophila spike-ins.

I already have peaks called using control data, as well as spike-in normalisation coefficients for a cohort of target samples across two conditions. The spike-in coefficients were computed from the Drosophila alignment results, as described in the ActiveMotif documentation for down-sampling (adjustment based on the minimum).
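For readers unfamiliar with the minimum-based adjustment mentioned above, here is a minimal base R sketch of one way such coefficients can be derived and applied. The sample names, read counts, and count matrix are all hypothetical, and the formula (smallest spike-in count divided by each sample's spike-in count) is an assumption about how the down-sampling factors were computed, not a statement of the exact protocol used here.

```r
# Hypothetical numbers of reads aligned to the Drosophila genome per sample
spikeReads <- c(sample1 = 1200000, sample2 = 800000, sample3 = 1000000)

# Minimum-based coefficient: the sample with the fewest spike-in reads gets 1,
# all other samples get a factor below 1
spikeCoef <- min(spikeReads) / spikeReads

# Toy count matrix: rows are consensus peaks, columns are samples
counts <- matrix(c(120, 80, 100,
                    60, 40,  50),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("peak1", "peak2"), names(spikeReads)))

# Multiply every column by the coefficient of its sample
adjCounts <- sweep(counts, 2, spikeCoef[colnames(counts)], "*")
print(adjCounts)
```

In this toy case the three samples differ only by their spike-in ratio, so the adjusted counts in each row become equal across samples.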

Is there an easy way to use these computed coefficients in the DiffBind analysis? I found this post, but it links to a pipeline that starts from reads, whereas I would only like to correct the computed score matrix before the differential analysis. The matrix is computed from the original samples without subsampling, but with the control taken into account.

DiffBind Chipseq spike-ins • 2.0k views

ADD COMMENT • link updated 4.1 years ago by Rory Stark ★ 5.2k • written 5.1 years ago by Konstantin Okonechnikov ▴ 40

score 1 · Answer 1 · 2019-11-19


Rory Stark ★ 5.2k

@rory-stark-5741


Cambridge, UK

You could retrieve the computed read counts using dba.peakset(), then correct them and use them as in the reference.

ADD COMMENT • link 5.0 years ago Rory Stark ★ 5.2k


Thanks a lot for the reply! A follow-up question: how do I create the object with the adjusted values for further processing? Is there a specific way to set the reference? Here is my code example:

# standard object creation
dtCounts <- dba.count(dt)
# retrieve peaks
peaksRes <- dba.peakset(dtCounts,1:numSamples,bRetrieve=TRUE)
# use spike-in coefficients to adjust results, here's example for one sample
peaksRes$sample1 = peaksRes$sample1 * k1
# how to re-write the target analysis object with adjusted values?
dtCountsAdj <- <???> 
# standard analysis continues....
dtCountsAdj <- dba.contrast(dtCountsAdj, categories=DBA_CONDITION)
dtCountsAdj <- dba.analyze(dtCountsAdj)

I would be grateful for any comments.

ADD REPLY • link 5.0 years ago Konstantin Okonechnikov ▴ 40


You can read in pre-set counts for a consensus peak set using dba() or dba.peakset(). The documentation is in the help page for dba.peakset() (see counts parameter), but I usually do this using dba() and a samplesheet with a column labelled Counts.
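As a rough illustration of the sample sheet route, the base R sketch below writes one adjusted-count file per sample and builds a sample sheet with a Counts column pointing at those files. The peak coordinates, counts, sample names, and conditions are hypothetical, and the file layout (tab-delimited, header row, columns chr/start/end/count) is an assumption: the help page for dba.peakset() is the place to confirm the exact format expected.

```r
# Toy consensus peaks and spike-in adjusted counts (all values hypothetical)
peaks <- data.frame(chr   = c("chr1", "chr1", "chr2"),
                    start = c(100, 5000, 200),
                    end   = c(600, 5400, 900))
adjCounts <- cbind(sample1 = c(80.2, 40.7, 12.1),
                   sample2 = c(75.0, 44.3, 30.9))

# Write one file per sample: chr, start, end, adjusted count.
# Counts are rounded on the assumption that downstream models expect integers.
outDir <- tempdir()
countFiles <- vapply(colnames(adjCounts), function(s) {
  f <- file.path(outDir, paste0(s, "_counts.txt"))
  write.table(cbind(peaks, count = round(adjCounts[, s])), f,
              sep = "\t", quote = FALSE, row.names = FALSE)
  f
}, character(1))

# Sample sheet with a Counts column holding the path to each count file
sampleSheet <- data.frame(SampleID  = colnames(adjCounts),
                          Condition = c("A", "B"),
                          Replicate = 1,
                          Counts    = unname(countFiles),
                          stringsAsFactors = FALSE)

# With DiffBind installed, the object would then be created roughly as:
# library(DiffBind)
# dtCountsAdj <- dba(sampleSheet = sampleSheet)
```

The resulting object would then go through dba.contrast() and dba.analyze() as in the code posted in the question.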

ADD REPLY • link 4.9 years ago Rory Stark ★ 5.2k

score 1 · Answer 2 · 2020-11-06

Direct support for exogenous spike-in normalization is now available in the latest release of DiffBind. The vignette has an example using Drosophila chromatin. You can supply the spike-in alignments in the sample sheet as separate bam files, or specify a set of chromosomes if you used a single combined reference. The spike-in reads are counted, and you can then use these counts to calculate normalization factors (either by the total number of aligned reads, or using TMM or RLE). So long as the spike-ins were quantified correctly in the original experiment, these methods work fine. There is also a clean interface to supply externally calculated normalization factors, or a matrix of offsets, if you want to use one of the more complicated modelling methods for dealing with spike-in data.
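A hedged sketch of how the two routes described above might look in code, assuming DiffBind 3.0 or later and an object that has already been through dba.count(). The wrapper function and its argument names are hypothetical; the spikein and normalize arguments of dba.normalize() are used as I understand them from the package documentation, so please check the help page and vignette before relying on this.

```r
# Hypothetical helper; dbaObj is a DBA object with counts already computed.
# extFactors, if given, is a numeric vector with one factor per sample.
normaliseWithSpikeins <- function(dbaObj, extFactors = NULL) {
  if (!requireNamespace("DiffBind", quietly = TRUE)) {
    stop("The DiffBind package is required for this sketch")
  }
  if (is.null(extFactors)) {
    # Route 1: let DiffBind derive factors from the spike-in reads
    # (for example bam files listed in a Spikein column of the sample sheet)
    DiffBind::dba.normalize(dbaObj, spikein = TRUE)
  } else {
    # Route 2: pass externally calculated normalization factors
    DiffBind::dba.normalize(dbaObj, normalize = extFactors)
  }
}

# Usage, not run here because it needs real data:
# dt <- normaliseWithSpikeins(dtCounts)                      # spike-in based
# dt <- normaliseWithSpikeins(dtCounts, extFactors = myFacs) # external factors
```

One point worth verifying before passing in precomputed coefficients: whether the counts are divided or multiplied by the supplied factors, since coefficients meant to multiply the counts may need to be inverted first.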