Question

Average coverage between multiple samples

1

Entering edit mode

Jake ▴ 90

@jake-7236

Last seen 20 months ago

United States

Hi,

I have 2 bed files from 2 control samples that have single nucleotide reads over a region. I would like to calculate the average coverage (and standard deviation) between the 2 samples at each base. I have calculated the size factors for each library to normalize the reads before averaging. I have also extended the reads by 100 basepairs to smooth out the coverage. I can easily make bigwigs of the individual normalized coverage, but haven't found a good way to average a coverage Rle object.

norm <- estimateSizeFactorsForMatrix(count_mat)

ctrl1 <- import('ctrl1.bed')
cov1 <- coverage(ctrl1 + 100)/norm['ctrl1']
cov_gr1 <- GRanges(cov1)
export(object = cov_gr1, con = 'ctrl1_cov.bw')

ctrl2 <- import('ctrl2.bed')
cov2 <- coverage(ctrl2 + 100)/norm['ctrl2']
cov_gr2 <- GRanges(cov2)
export(object = cov_gr2, con = 'ctrl2_cov.bw')

Essentially I want to calculate at each base

mean(c(cov1, cov2))
sd(c(cov1, cov2))

Thank you

GenomicRanges • 887 views

ADD COMMENT • link updated 3.4 years ago by Michael Lawrence ★ 11k • written 3.4 years ago by Jake ▴ 90

score 0 · Answer 1 · 2020-12-18

0

Entering edit mode

Michael Lawrence ★ 11k

@michael-lawrence-3846

Last seen 2.4 years ago

United States

If you want the average at each base, it would be (cov1 + cov2)/2, although you will need to be careful to ensure that the coverage vectors are the same length (see ?coverage). If you just want the average across both samples and all positions, then your last code block should do the trick.

ADD COMMENT • link 3.4 years ago Michael Lawrence ★ 11k