Hello,
Can anyone help me understand how to use csaw to 1) filter my BAM file reads to retain only windows within MACS called peaks, 2) then count the number of reads in 100bp bins within a 5kbp region around each peak?
I have my BAM reads file, my BED peaks file and then an additional BED file which has 5kbp regions around each peak.
I've tried to filter my reads using the peaks as follows-
counts <- windowCounts(p300.bam, width=10, ext=frag.len, param=param)
> locality <- regionCounts(p300.bam, resize(rowRanges(counts), fix="center", 100))
Warning messages:
1: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
GRanges object contains 4 out-of-bound ranges located on
sequences chr11, chr18, chr5, and chrX. Note that only
ranges located on a non-circular sequence whose length is
not NA can be considered out-of-bound (use seqlengths()
and isCircular() to get the lengths and circularity flags
of the underlying sequences). You can use trim() to trim
these ranges. See ?`trim,GenomicRanges-method` for more
information.
2: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
GRanges object contains 4 out-of-bound ranges located on
sequences chr11, chr18, chr5, and chrX. Note that only
ranges located on a non-circular sequence whose length is
not NA can be considered out-of-bound (use seqlengths()
and isCircular() to get the lengths and circularity flags
of the underlying sequences). You can use trim() to trim
these ranges. See ?`trim,GenomicRanges-method` for more
information.
> stats <- filterWindows(counts, locality, type="local")
Error in .checkLibSizes(data, background) :
data and background totals should be identical
Many thanks for any help,
Julia
Thank you very much for your help. I understand now how to build the matrices, but I'm still a bit lost on how to filter for reads that fall within peaks. Would you mind going through that part one more time?
Very much appreciated,
Julia
Filter for reads, or filter for windows? If you want to identify individual reads that fall into peaks, then you need to construct a
GRanges
object from the read intervals (e.g., usingextractReads
) and identify overlaps with the peak intervals usingfindOverlaps
. If you want to identify windows that fall into peaks, then you can identify overlaps between the window intervals (fromrowRanges
) and the peak intervals withfindOverlaps
. Read the documentation for thefindOverlaps
function if you're still not sure - it's a very useful utility so it's worth understanding.