Match Pattern Vector to Subject Vector
Dario Strbenac ★ 1.5k
@dario-strbenac-5916
Australia

Given that vmatchPattern doesn't work with a vector as the pattern and vmatchPDict isn't implemented, are there alternatives in R that don't involve using short read mapping algorithms and building indexes?

@martin-morgan-1513
United States

Maybe you can try AhoCorasickTrie; it would be good to know if this works for your purposes.


Thanks for the suggestion. However, AhoCorasickSearch currently supports neither mismatches nor indels, so I doubt it would be useful for many genomics applications. I notice that my question is essentially the same as matching one AAStringSet against another AAStringSet. It might be a common enough use case to merit an optimised solution in Biostrings.


Good to know about the limitations. Another possibility is to 'unlist' one of the StringSets into a single *String, with the sequences separated by nonsense (e.g., poly-N), match against it, then relist the results as appropriate.
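
A rough sketch of the concatenate, match, relist idea, in case it helps. The helper name match_sets is made up for illustration and is not part of Biostrings. It assumes DNA input with no N in the patterns and relies on matchPattern's default fixed=TRUE, so the N separators are treated literally. For an AAStringSet a different separator letter would be needed. Hits that touch a separator are discarded, and the remaining coordinates are mapped back to each original subject.

```r
library(Biostrings)

# Hypothetical helper (not a Biostrings function): match every pattern
# against every subject by searching one concatenated string.
match_sets <- function(patterns, subjects, max.mismatch = 0, ...) {
  # Poly-N separator as long as the longest pattern (an arbitrary choice)
  sep <- strrep("N", max(width(patterns)))
  combined <- DNAString(paste0(as.character(subjects), sep, collapse = ""))

  # First and last position of each subject inside the combined string
  block  <- width(subjects) + nchar(sep)
  starts <- cumsum(c(1L, head(block, -1)))
  ends   <- starts + width(subjects) - 1L

  hits <- lapply(seq_along(patterns), function(i) {
    m <- matchPattern(patterns[[i]], combined,
                      max.mismatch = max.mismatch, ...)
    # Which subject does each hit start in? (0 if it starts before position 1)
    s <- findInterval(start(m), starts)
    # Keep only hits that lie entirely inside that subject
    keep <- s > 0 & end(m) <= ends[pmax(s, 1L)]
    s <- s[keep]
    data.frame(pattern = rep(i, length(s)),
               subject = s,
               start   = start(m)[keep] - starts[s] + 1L,
               end     = end(m)[keep] - starts[s] + 1L)
  })
  do.call(rbind, hits)
}

# Toy data, invented for this example
subjects <- DNAStringSet(c("ACGTACGTAA", "TTTTACGAAC", "GGGGGGGG"))
patterns <- DNAStringSet(c("ACGT", "GGGG"))

print(match_sets(patterns, subjects))                    # exact matches
print(match_sets(patterns, subjects, max.mismatch = 1))  # allow one mismatch
```

This still loops over the patterns, so it only saves the loop over subjects. Extra arguments such as with.indels = TRUE are passed through to matchPattern, though I have not checked how that scales on large sets.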