bamsignals - bamcoverage has no strand-specific option??
1
2
Entering edit mode
gtho123 ▴ 40
@gtho123-8872
Last seen 6.6 years ago
New Zealand

I am trying to create coverage plots from some strand specific BAM files and have run into the problem of the bamCoverage() function in the bamsignals package not having a strand specific option. This is not the case for the similar bamCount() and bamProfile() functions which only look at the 5` end of the read.

Is there a reason for this and if not how can I obtain the number of reads covering each base pair in a strand-specific manner so that it produces an output in the CountSignals format similar to the above mentioned functions?

 

Note: I posted the same question earlier and now I can't find it and get a 404 error when I try. So I thought I'd post again.

BAM bamsignals Coverage • 1.8k views
ADD COMMENT
2
Entering edit mode

Just a note that the original question sent email to the maintainer, who had an autoreply generating a comment on the original question that noted that they were out of the office until (23 January?) which generated an email to the maintainer which generated an autoreply... and drastic steps were taken on our end to stop this loop; the maintainer will see your message when they return...

ADD REPLY
0
Entering edit mode
helmuth • 0
@helmuth-12219
Last seen 7.4 years ago

Dear @gtho123,

Thanks for your question!

By design bamsignals::bamProfile() and bamsignals::bamCount() work on the read level which allows for the quantification of read counts on each strand individually. On the contrary, bamsignals::bamCoverage() works on the fragment level which does not allow for strand-specific DNA fragment coverage because the strand of one DNA fragment is effectively undefined.

However, if a strand-specific paired end protocol (e.g. RNA-seq) is used, a strand-specific fragment coverage can be still be defined where the first read in a pair defines the strand that gave rise to this fragment. In this case, you can achieve a strand-specific coverage by filtering on the SAMFLAG with the filteredFlag argument ( see https://broadinstitute.github.io/picard/explain-flags.html ):

  • To get the fragment coverage for the first strand set filteredFlag = 32 and paired.end = TRUE

  • To get the fragment coverage for the first strand set filteredFlag = 16 and paired.end = TRUE

If you deal with single end libraries, I suggest you use bamsignals::bamProfile() to filter on the strand and infer the coverage based on the expected fragment length ( see https://github.com/lamortenera/bamsignals/issues/16 ). For example,

bamCoverageFrags <- function(bampath, gr, fragLength=200, ss = TRUE, ...) {
  prf <- bamProfile(bampath, gr, binsize=1, ss = TRUE, ...)
  cov <- lapply(as.list(prf), function(p) {
    l <- dim(p)[2]
    x1 <- cumsum(as.numeric(p[1,]))
    y1 <- c(rep(0, fragLength), x1[1:(l-fragLength)])
    x2 <- cumsum(as.numeric(rev(p[2,])))
    y2 <- c(rep(0, fragLength), x2[1:(l-fragLength)])
    matrix(c(x1-y1, rev(x2-y2)), ncol = l, byrow = T,
           dimnames = list(c("sense", "antisense"), NULL))
  })
  if (ss == TRUE) {
    prf@signals <- cov
  } else {
    prf@signals <- lapply(cov, colSums)
  }
  invisible(prf)
}

Johannes

ADD COMMENT

Login before adding your answer.

Traffic: 573 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6