Question: alternative to matchprobes
Dear all,

I've found out that matchprobes has been deprecated and that the function in question has been moved to the Biostrings package, which has also been at least partially deprecated. The Biostrings manual suggests the functions matchPDict and vmatchPDict; the other pattern-matching functions I found were the grep family of functions and pmatch. I'd love to hear your suggestions on what to use instead of matchprobes. Any help is much appreciated.

Best,
Katja

--
Katja Löytynoja, M.Sc.
Haartman Institute
Department of Medical Genetics
Biomedicum Helsinki
P.O.Box 63
FIN-00014 University of Helsinki
tel +358-9-19125110
gsm +358-50-4000324
fax +358-9-19125624
e-mail katja.loytynoja at helsinki.fi
matchprobes biostrings • 407 views
modified 8.3 years ago by Hervé Pagès • written 8.3 years ago by kloytyno@mappi.helsinki.fi
Answer: alternative to matchprobes
Hervé Pagès wrote:
Hi Katja,

On 11-05-14 06:58 AM, kloytyno at mappi.helsinki.fi wrote:
> Dear all,
>
> I've found out that matchprobes has been deprecated and the function in
> question has been moved to Biostrings package, that also has been at
> least partially deprecated. Biostrings manual suggested functions
> matchPDict and vmatchPDict, and other pattern matching functions I found
> were grep-family of functions and pmatch. I'd love to hear your
> suggestions and information on what to use instead of matchprobes. Any
> help is much appreciated.

Note that the matchprobes() function is not deprecated at the moment (but might be in the near future).

What to use exactly depends on what you want to do. More precisely:

1. What kind of sequences do you have: DNA, RNA, other?

2. How many do you have?
   (a) Just a few short patterns and a few long subjects. More precisely, you have just a few short strings that you want to match against just a few long strings.
   (b) A lot (millions) of short patterns and just a few long subjects.
   (c) Just a few short patterns that you want to match against a lot (millions) of short subjects.
   (d) A lot (millions) of very short patterns that you want to match against a lot of very short subjects.

3. Do all your patterns have the same length?

4. What kind of matching do you want to perform: exact? Or with mismatches? Or maybe also with indels?

5. If your sequences are DNA or RNA, do they contain IUPAC ambiguity letters? In the patterns? In the subjects? In both? And if so, do you want to handle them as ambiguities?

6. What information do you want returned:
   (a) the locations of all the matches,
   (b) just the number of matches, or
   (c) only which patterns have matches?

I won't draw the decision tree that you could follow based on your answers to all these questions (I don't have such a tree yet; adding one to the Biostrings documentation has been on my TODO list for a long time), but if you can provide the answers here I will try to direct you to the right function to use.

Cheers,
H.

> best,
> Katja

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
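Editorial illustration, not part of the reply above: a minimal sketch of how some answers to questions 2, 4 and 6 might map onto Biostrings functions. It assumes DNA sequences, patterns of equal length, and no IUPAC ambiguity letters. The toy sequences and the variable names (patterns, subject, reads, pdict) are made up for this example; which function is actually appropriate still depends on the answers to the questions above.

```r
# Sketch only: toy data invented for illustration, not taken from the thread.
library(Biostrings)

# A few short DNA patterns of equal length and one longer subject.
patterns <- DNAStringSet(c("ACGT", "TTGA", "GGGG"))
subject  <- DNAString("ACGTTTGAACGTAC")

# Case 2(a): one pattern against one subject, exact matching.
matchPattern(patterns[[1]], subject)   # 6(a): locations of all matches
countPattern(patterns[[1]], subject)   # 6(b): just the number of matches

# Question 4: allow up to one mismatch per match.
matchPattern(patterns[[1]], subject, max.mismatch = 1)

# Case 2(c): one pattern against a set of short subjects.
reads <- DNAStringSet(c("ACGTAC", "TTTTTT", "GACGTA"))
vcountPattern(patterns[[1]], reads)    # one count per subject

# Case 2(b): many same-length patterns, preprocessed once into a dictionary.
pdict <- PDict(patterns)
matchPDict(pdict, subject)   # 6(a): match locations, one entry per pattern
countPDict(pdict, subject)   # 6(b): number of matches per pattern
whichPDict(pdict, subject)   # 6(c): indices of patterns with at least one match
```

With only three patterns the dictionary step is overkill; it is shown here because PDict is intended for the "millions of short patterns" situation in 2(b), where preprocessing the patterns once pays off.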
written 8.3 years ago by Hervé Pagès