Question: Match Pattern Vector to Subject Vector
Dario Strbenac (Australia) wrote, 16 months ago:

Given that vmatchPattern doesn't work with a vector as the pattern and vmatchPDict isn't implemented, are there alternatives in R that don't involve using short read mapping algorithms and building indexes?

Martin Morgan (United States) wrote, 16 months ago:

Maybe you can try the AhoCorasickTrie package; it would be good to know if this works for your purposes.
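A minimal sketch of what that might look like, assuming the CRAN package AhoCorasickTrie is installed. The sequences are toy data made up for illustration, and the search is exact only:

```r
library(AhoCorasickTrie)  # CRAN package

# Toy data (assumed for illustration): two patterns, two subject sequences.
keywords <- c("ACGT", "GGGA")
texts <- c(s1 = "ACGTACGTTT", s2 = "GGGACGAACC")

# Exact multi-pattern search; no mismatches or indels are allowed.
hits <- AhoCorasickSearch(keywords, texts, alphabet = "nucleicacid")

# Inspect the nested list of matches returned for each text.
str(hits)
```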


Thanks for the suggestion. However, AhoCorasickSearch currently supports neither mismatches nor indels, so I doubt that it would be useful for many genomics applications. I notice that my question is basically the same as matching one AAStringSet against another AAStringSet. It might be a common use case worth an optimised solution in Biostrings.

Reply written 16 months ago by Dario Strbenac

Good to know about the limitations. Another possibility is to 'unlist' one of the StringSets into a single *String separated by nonsense (e.g., poly-N), match, then relist the result as appropriate.
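A rough sketch of that idea, not a tested or optimised solution. The helper name `match_set_vs_set`, the toy DNA sequences, the padding length (the longest pattern width) and `max.mismatch = 1` are all assumptions made for illustration. Hits are mapped back to the original subjects by keeping only those that fall entirely within one subject, which also discards anything touching the poly-N padding:

```r
library(Biostrings)  # Bioconductor; also attaches IRanges

# Hypothetical helper: match every pattern against every subject by
# searching one padded, concatenated sequence and mapping hits back.
match_set_vs_set <- function(patterns, subjects, max.mismatch = 0L) {
  pad_len <- max(width(patterns))
  pad <- strrep("N", pad_len)
  combined <- DNAString(paste(as.character(subjects), collapse = pad))

  # Where each original subject sits inside the combined sequence.
  w <- width(subjects)
  subject_ranges <- IRanges(
    start = cumsum(c(0L, head(w + pad_len, -1L))) + 1L,
    width = w
  )

  per_pattern <- lapply(seq_along(patterns), function(i) {
    hits <- matchPattern(patterns[[i]], combined, max.mismatch = max.mismatch)
    hit_ranges <- IRanges(start = start(hits), end = end(hits))
    # Keep only hits lying entirely inside one subject; this drops hits
    # that touch the padding or run off either end of the combined sequence.
    ov <- findOverlaps(hit_ranges, subject_ranges, type = "within")
    offset <- start(subject_ranges)[subjectHits(ov)] - 1L
    data.frame(
      pattern = rep(i, length(ov)),
      subject = subjectHits(ov),
      start = start(hit_ranges)[queryHits(ov)] - offset,
      end = end(hit_ranges)[queryHits(ov)] - offset
    )
  })
  do.call(rbind, per_pattern)
}

# Toy data (assumed for illustration).
subjects <- DNAStringSet(c("ACGTACGTTT", "GGGACGAACC", "TTTTACGTAA"))
patterns <- DNAStringSet(c("ACGT", "GGGA"))

result <- match_set_vs_set(patterns, subjects, max.mismatch = 1L)
print(result)
```

The result has one row per hit, with the pattern and subject given as indices into the two sets and coordinates relative to the original subject. Each pattern still needs its own matchPattern call, so this only removes the loop over subjects.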

Reply written 16 months ago by Martin Morgan