Question: bamsignals - bamcoverage has no strand-specific option??
2
2.8 years ago by
gtho12340
New Zealand
gtho12340 wrote:

I am trying to create coverage plots from some strand specific BAM files and have run into the problem of the bamCoverage() function in the bamsignals package not having a strand specific option. This is not the case for the similar bamCount() and bamProfile() functions which only look at the 5 end of the read.

Is there a reason for this and if not how can I obtain the number of reads covering each base pair in a strand-specific manner so that it produces an output in the CountSignals format similar to the above mentioned functions?

Note: I posted the same question earlier and now I can't find it and get a 404 error when I try. So I thought I'd post again.

coverage bam bamsignals • 710 views
modified 2.8 years ago by helmuth0 • written 2.8 years ago by gtho12340
2

Just a note that the original question sent email to the maintainer, who had an autoreply generating a comment on the original question that noted that they were out of the office until (23 January?) which generated an email to the maintainer which generated an autoreply... and drastic steps were taken on our end to stop this loop; the maintainer will see your message when they return...

Answer: bamsignals - bamcoverage has no strand-specific option??
0
2.8 years ago by
helmuth0
helmuth0 wrote:

Dear @gtho123,

By design bamsignals::bamProfile() and bamsignals::bamCount() work on the read level which allows for the quantification of read counts on each strand individually. On the contrary, bamsignals::bamCoverage() works on the fragment level which does not allow for strand-specific DNA fragment coverage because the strand of one DNA fragment is effectively undefined.

However, if a strand-specific paired end protocol (e.g. RNA-seq) is used, a strand-specific fragment coverage can be still be defined where the first read in a pair defines the strand that gave rise to this fragment. In this case, you can achieve a strand-specific coverage by filtering on the SAMFLAG with the filteredFlag argument ( see https://broadinstitute.github.io/picard/explain-flags.html ):

• To get the fragment coverage for the first strand set filteredFlag = 32 and paired.end = TRUE

• To get the fragment coverage for the first strand set filteredFlag = 16 and paired.end = TRUE

If you deal with single end libraries, I suggest you use bamsignals::bamProfile() to filter on the strand and infer the coverage based on the expected fragment length ( see https://github.com/lamortenera/bamsignals/issues/16 ). For example,

bamCoverageFrags <- function(bampath, gr, fragLength=200, ss = TRUE, ...) {
prf <- bamProfile(bampath, gr, binsize=1, ss = TRUE, ...)
cov <- lapply(as.list(prf), function(p) {
l <- dim(p)[2]
x1 <- cumsum(as.numeric(p[1,]))
y1 <- c(rep(0, fragLength), x1[1:(l-fragLength)])
x2 <- cumsum(as.numeric(rev(p[2,])))
y2 <- c(rep(0, fragLength), x2[1:(l-fragLength)])
matrix(c(x1-y1, rev(x2-y2)), ncol = l, byrow = T,
dimnames = list(c("sense", "antisense"), NULL))
})
if (ss == TRUE) {
prf@signals <- cov
} else {
prf@signals <- lapply(cov, colSums)
}
invisible(prf)
}`

Johannes