Question

bamsignals - bamcoverage has no strand-specific option??

2

Entering edit mode

gtho123 ▴ 40

@gtho123-8872

Last seen 6.0 years ago

New Zealand

I am trying to create coverage plots from some strand specific BAM files and have run into the problem of the bamCoverage() function in the bamsignals package not having a strand specific option. This is not the case for the similar bamCount() and bamProfile() functions which only look at the 5` end of the read.

Is there a reason for this and if not how can I obtain the number of reads covering each base pair in a strand-specific manner so that it produces an output in the CountSignals format similar to the above mentioned functions?

Note: I posted the same question earlier and now I can't find it and get a 404 error when I try. So I thought I'd post again.

BAM bamsignals Coverage • 1.6k views

ADD COMMENT • link updated 7.3 years ago by helmuth • 0 • written 7.3 years ago by gtho123 ▴ 40

2

Entering edit mode

Just a note that the original question sent email to the maintainer, who had an autoreply generating a comment on the original question that noted that they were out of the office until (23 January?) which generated an email to the maintainer which generated an autoreply... and drastic steps were taken on our end to stop this loop; the maintainer will see your message when they return...

ADD REPLY • link 7.3 years ago Martin Morgan 25k

score 0 · Answer 1 · 2017-01-25

Dear @gtho123,

Thanks for your question!

By design bamsignals::bamProfile() and bamsignals::bamCount() work on the read level which allows for the quantification of read counts on each strand individually. On the contrary, bamsignals::bamCoverage() works on the fragment level which does not allow for strand-specific DNA fragment coverage because the strand of one DNA fragment is effectively undefined.

However, if a strand-specific paired end protocol (e.g. RNA-seq) is used, a strand-specific fragment coverage can be still be defined where the first read in a pair defines the strand that gave rise to this fragment. In this case, you can achieve a strand-specific coverage by filtering on the SAMFLAG with the filteredFlag argument ( see https://broadinstitute.github.io/picard/explain-flags.html ):

To get the fragment coverage for the first strand set filteredFlag = 32 and paired.end = TRUE
To get the fragment coverage for the first strand set filteredFlag = 16 and paired.end = TRUE

If you deal with single end libraries, I suggest you use bamsignals::bamProfile() to filter on the strand and infer the coverage based on the expected fragment length ( see https://github.com/lamortenera/bamsignals/issues/16 ). For example,

bamCoverageFrags <- function(bampath, gr, fragLength=200, ss = TRUE, ...) {
  prf <- bamProfile(bampath, gr, binsize=1, ss = TRUE, ...)
  cov <- lapply(as.list(prf), function(p) {
    l <- dim(p)[2]
    x1 <- cumsum(as.numeric(p[1,]))
    y1 <- c(rep(0, fragLength), x1[1:(l-fragLength)])
    x2 <- cumsum(as.numeric(rev(p[2,])))
    y2 <- c(rep(0, fragLength), x2[1:(l-fragLength)])
    matrix(c(x1-y1, rev(x2-y2)), ncol = l, byrow = T,
           dimnames = list(c("sense", "antisense"), NULL))
  })
  if (ss == TRUE) {
    prf@signals <- cov
  } else {
    prf@signals <- lapply(cov, colSums)
  }
  invisible(prf)
}

Johannes