Does "exclude" argument break BSgenome::vmatchPattern?
1
0
Entering edit mode
Aditya ▴ 160
@aditya-7667
Last seen 2.5 years ago
Germany

Using the exclude argument seems to break BSgenome::vmatchPattern:

# vmatchPattern on full BSgenome works fine:

    pattern <- 'GTAACGGCAGACTTCTCCAC'
    bs <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
    results <- BSgenome::vmatchPattern(
                   pattern, bs, min.mismatch=0, max.mismatch=3)
    results[GenomicRanges::seqnames(results)=='chr11']


    GRanges object with 8 ranges and 0 metadata columns:
      seqnames              ranges strand
         <Rle>           <IRanges>  <Rle>
      [1]    chr11     5226984-5227003      +
      [2]    chr11     5269853-5269872      +
      [3]    chr11   59369899-59369918      +
      [4]    chr11   72402607-72402626      +
      [5]    chr11   91170709-91170728      +
      [6]    chr11 108624292-108624311      +
      [7]    chr11 132892225-132892244      +
      [8]    chr11   87142523-87142542      -
      -------
      seqinfo: 455 sequences from an unspecified genome


# But using the `exlude` argument seems to break it:

    require(magrittr)
    excl <- sprintf('chr%s', c(as.character(1:22), 'X', 'Y', 'M')) %>% setdiff('chr11')
    results <- BSgenome::vmatchPattern(
                   pattern, bs, min.mismatch=0, max.mismatch=3, exclude=excl)


    GRanges object with 0 ranges and 0 metadata columns:
       seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
       -------
       seqinfo: 455 sequences from an unspecified genome
BSgenome • 725 views
ADD COMMENT
2
Entering edit mode
@james-w-macdonald-5106
Last seen 13 hours ago
United States

Seems to want a regex rather than a match, so you need to be careful.

> exc <- paste0("^chr", c(1:10,12:22, "X","Y","M"), "$")
> exc
 [1] "^chr1$"  "^chr2$"  "^chr3$"  "^chr4$"  "^chr5$"  "^chr6$"  "^chr7$"
 [8] "^chr8$"  "^chr9$"  "^chr10$" "^chr12$" "^chr13$" "^chr14$" "^chr15$"
[15] "^chr16$" "^chr17$" "^chr18$" "^chr19$" "^chr20$" "^chr21$" "^chr22$"
[22] "^chrX$"  "^chrY$"  "^chrM$"
> results2 <- vmatchPattern(pattern, Hsapiens, min.mismatch = 0, max.mismatch = 3, exclude = exc)
> results2[seqnames(results2) == "chr11"]
GRanges object with 8 ranges and 0 metadata columns:
      seqnames              ranges strand
         <Rle>           <IRanges>  <Rle>
  [1]    chr11     5226984-5227003      +
  [2]    chr11     5269853-5269872      +
  [3]    chr11   59369899-59369918      +
  [4]    chr11   72402607-72402626      +
  [5]    chr11   91170709-91170728      +
  [6]    chr11 108624292-108624311      +
  [7]    chr11 132892225-132892244      +
  [8]    chr11   87142523-87142542      -
  -------
  seqinfo: 455 sequences from an unspecified genome
ADD COMMENT

Login before adding your answer.

Traffic: 939 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6