Search
Question: bamsignals - bamcoverage has no strand-specific option??
2
gravatar for gtho123
10 months ago by
gtho12330
New Zealand
gtho12330 wrote:

I am trying to create coverage plots from some strand specific BAM files and have run into the problem of the bamCoverage() function in the bamsignals package not having a strand specific option. This is not the case for the similar bamCount() and bamProfile() functions which only look at the 5` end of the read.

Is there a reason for this and if not how can I obtain the number of reads covering each base pair in a strand-specific manner so that it produces an output in the CountSignals format similar to the above mentioned functions?

 

Note: I posted the same question earlier and now I can't find it and get a 404 error when I try. So I thought I'd post again.

ADD COMMENTlink modified 9 months ago by helmuth0 • written 10 months ago by gtho12330
2

Just a note that the original question sent email to the maintainer, who had an autoreply generating a comment on the original question that noted that they were out of the office until (23 January?) which generated an email to the maintainer which generated an autoreply... and drastic steps were taken on our end to stop this loop; the maintainer will see your message when they return...

ADD REPLYlink written 10 months ago by Martin Morgan ♦♦ 20k
0
gravatar for helmuth
9 months ago by
helmuth0
helmuth0 wrote:

Dear @gtho123,

Thanks for your question!

By design bamsignals::bamProfile() and bamsignals::bamCount() work on the read level which allows for the quantification of read counts on each strand individually. On the contrary, bamsignals::bamCoverage() works on the fragment level which does not allow for strand-specific DNA fragment coverage because the strand of one DNA fragment is effectively undefined.

However, if a strand-specific paired end protocol (e.g. RNA-seq) is used, a strand-specific fragment coverage can be still be defined where the first read in a pair defines the strand that gave rise to this fragment. In this case, you can achieve a strand-specific coverage by filtering on the SAMFLAG with the filteredFlag argument ( see https://broadinstitute.github.io/picard/explain-flags.html ):

  • To get the fragment coverage for the first strand set filteredFlag = 32 and paired.end = TRUE

  • To get the fragment coverage for the first strand set filteredFlag = 16 and paired.end = TRUE

If you deal with single end libraries, I suggest you use bamsignals::bamProfile() to filter on the strand and infer the coverage based on the expected fragment length ( see https://github.com/lamortenera/bamsignals/issues/16 ). For example,

bamCoverageFrags <- function(bampath, gr, fragLength=200, ss = TRUE, ...) {
  prf <- bamProfile(bampath, gr, binsize=1, ss = TRUE, ...)
  cov <- lapply(as.list(prf), function(p) {
    l <- dim(p)[2]
    x1 <- cumsum(as.numeric(p[1,]))
    y1 <- c(rep(0, fragLength), x1[1:(l-fragLength)])
    x2 <- cumsum(as.numeric(rev(p[2,])))
    y2 <- c(rep(0, fragLength), x2[1:(l-fragLength)])
    matrix(c(x1-y1, rev(x2-y2)), ncol = l, byrow = T,
           dimnames = list(c("sense", "antisense"), NULL))
  })
  if (ss == TRUE) {
    prf@signals <- cov
  } else {
    prf@signals <- lapply(cov, colSums)
  }
  invisible(prf)
}

Johannes

ADD COMMENTlink modified 9 months ago • written 9 months ago by helmuth0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 131 users visited in the last hour