DiffBind with spike-ins
@9540571b

I am hoping to get a little more information about how to use DiffBind for spike-in recalibration; the vignette is a little sparse on this topic. Section 7.6 has the following code:


spikes <- dba.normalize(spikes, spikein = spikes.spikeins)

The spikes.spikeins object is loaded as part of load(system.file('extra/spikes.rda',package='DiffBind')). The vignette states:

Note that precalculated background reads are included for the example in an object named spikes.spikeins, so we do not need to recount them for the vignette; we can pass the pre-calculated ones in instead. Normally, with access to the spike-in reads, setting spikein=TRUE will result in the spike-in reads being counted.

Could we get a little more code showing how spikes.spikeins was made, perhaps with dba.count()?

Rory Stark ★ 5.2k
@rory-stark-5741

Spike-ins are part of normalization and are calculated in the dba.normalize() function. The help page for dba.normalize() has some documentation for how to use the spikein parameter.

The primary way to include spike-in reads is by including a separate set of aligned bam files in your sample sheet (using a column named Spikein). If spikein=TRUE, the total number of aligned reads in these tracks will be used to calculate normalization factors.
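As a rough sketch of this first approach, the sample sheet below adds a Spikein column next to bamReads and then calls dba.normalize() with spikein=TRUE. All sample names and file paths are hypothetical placeholders, and the wrapper function normalize_with_spikein_bams() is only an illustrative helper, not part of DiffBind.

```r
# Hypothetical sample sheet: every sample name and path below is a placeholder.
ids <- c("ctrl_1", "ctrl_2", "treat_1", "treat_2")
samples <- data.frame(
  SampleID   = ids,
  Condition  = c("control", "control", "treated", "treated"),
  Replicate  = c(1, 2, 1, 2),
  bamReads   = file.path("bam", paste0(ids, ".bam")),          # main ChIP/ATAC reads
  Spikein    = file.path("spikein", paste0(ids, ".bam")),      # separately aligned spike-in reads
  Peaks      = file.path("peaks", paste0(ids, ".narrowPeak")),
  PeakCaller = "narrow",
  stringsAsFactors = FALSE
)

# Illustrative helper (not a DiffBind function): build the DBA object,
# count reads, then normalize using the Spikein tracks.
# Requires DiffBind and the files named in the sample sheet to exist.
normalize_with_spikein_bams <- function(sample_sheet) {
  db <- DiffBind::dba(sampleSheet = sample_sheet)
  db <- DiffBind::dba.count(db)
  DiffBind::dba.normalize(db, spikein = TRUE)
}
```

Calling normalize_with_spikein_bams(samples) would only work once the placeholder paths point at real files.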

If your spike-in reads are included in the main (ChIP/ATAC) bam files but fall on a distinct set of chromosomes (i.e., you aligned to a hybrid reference genome), you don't need to add Spikein bam files to the sample sheet. Instead, set spikein to the names of the chromosomes that carry the spike-in reads; the total number of reads on these chromosomes in the main bam files will then be used to calculate normalization factors.
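A minimal sketch of the hybrid-genome case follows. It assumes the spike-in chromosomes carry a "dm6_" prefix in the hybrid reference, which is purely an illustrative naming convention; the chromosome list and the helper normalize_with_spikein_chroms() are likewise hypothetical. The db argument is assumed to be a DBA object that already has counts from dba.count().

```r
# Hypothetical chromosome names from a hybrid reference genome; the "dm6_"
# prefix marking the spike-in chromosomes is an assumed convention.
all_chroms <- c("chr1", "chr2", "chrX", "dm6_chr2L", "dm6_chr2R", "dm6_chrX")

# Keep only the chromosomes that carry spike-in reads.
spike_chroms <- grep("^dm6_", all_chroms, value = TRUE)

# Illustrative helper (not a DiffBind function): pass the chromosome names
# as the spikein argument so reads on them are used for normalization.
# `db` is assumed to be a DBA object with counts from dba.count().
normalize_with_spikein_chroms <- function(db, chroms) {
  DiffBind::dba.normalize(db, spikein = chroms)
}
```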

You can also limit the spike-in counts to pre-defined intervals in either the primary or Spikein bam files by setting spikein= to a GRanges object containing known intervals.
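For the interval-based option, something along these lines might work. The interval coordinates and chromosome names are made up for illustration, and normalize_with_spikein_intervals() is a hypothetical helper; db is again assumed to be a DBA object with counts from dba.count(). The conversion uses makeGRangesFromDataFrame() from the GenomicRanges package.

```r
# Hypothetical intervals where spike-in signal is expected; the coordinates
# and chromosome names are placeholders.
interval_table <- data.frame(
  seqnames = c("dm6_chr2L", "dm6_chr2L", "dm6_chr2R"),
  start    = c(1000, 50000, 20000),
  end      = c(1500, 50600, 20800),
  stringsAsFactors = FALSE
)

# Illustrative helper (not a DiffBind function): convert the table to a
# GRanges object and pass it as spikein, restricting the spike-in counts
# to these intervals. Requires GenomicRanges and DiffBind to be installed.
normalize_with_spikein_intervals <- function(db, intervals_df) {
  intervals <- GenomicRanges::makeGRangesFromDataFrame(intervals_df)
  DiffBind::dba.normalize(db, spikein = intervals)
}
```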

If you want to see the code that generates the example spikes objects, you can access it within the package:

file.edit(system.file('extra/GenerateSpikein.R',package='DiffBind'))

This script assumes that the BrundleData package is installed in a subdirectory called holding.

If you have a more specific spike-in scenario I can suggest how to include them.
