alternative to matchprobes
@kloytynomappihelsinkifi-4649
Dear all,

I've found out that matchprobes has been deprecated and that the function in question has been moved to the Biostrings package, which has itself been at least partially deprecated. The Biostrings manual suggests the functions matchPDict and vmatchPDict; the other pattern-matching functions I found were the grep family of functions and pmatch. I'd love to hear your suggestions and information on what to use instead of matchprobes. Any help is much appreciated.

best,
Katja

--
Katja Löytynoja, M.Sc.
Haartman Institute
Department of Medical Genetics
Biomedicum Helsinki
P.O.Box 63
FIN-00014 University of Helsinki
tel +358-9-19125110
gsm +358-50-4000324
fax +358-9-19125624
e-mail katja.loytynoja at helsinki.fi
matchprobes Biostrings • 660 views
@herve-pages-1542
Seattle, WA, United States
Hi Katja,

On 11-05-14 06:58 AM, kloytyno at mappi.helsinki.fi wrote:
>
> Dear all,
>
> I've found out that matchprobes has been deprecated and the function in
> question has been moved to Biostrings package, that also has been at
> least partially deprecated. Biostrings manual suggested functions
> matchPDict and vmatchPDict, and other pattern matching functions I found
> were grep-family of functions and pmatch. I'd love to hear your
> suggestions and information on what to use instead of matchprobes. Any
> help is much appreciated.

Note that the matchprobes() function is not deprecated at the moment (but might be in the near future).

What to use exactly depends on what you want to do. More precisely:

1. What kind of sequences do you have: DNA, RNA, other?

2. How many do you have?
   (a) Just a few short patterns and a few long subjects, i.e. a few short strings that you want to match against a few long strings.
   (b) A lot (millions) of short patterns and just a few long subjects.
   (c) Just a few short patterns that you want to match against a lot (millions) of short subjects.
   (d) A lot (millions) of very short patterns that you want to match against a lot of very short subjects.

3. Do all your patterns have the same length?

4. What kind of matching do you want to perform: exact? With mismatches? Or maybe also with indels?

5. If your sequences are DNA or RNA, do they contain IUPAC ambiguity letters? In the patterns? In the subjects? In both? And if so, do you want to handle them as ambiguities?

6. What information do you want returned:
   (a) the locations of all the matches
   (b) just the number of matches
   (c) only which patterns have matches

I won't draw the decision tree that you could follow based on your answers to all these questions (I don't have such a tree yet; it's something I need to add to the Biostrings documentation and it has been on my TODO list for a long time), but if you can provide the answers here I will try to direct you to the right function to use.

Cheers,
H.

>
> best,
> Katja
>
> --

Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
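To make some of the cases in question 2 concrete, here is a minimal sketch of how they might map onto existing Biostrings functions for exact matching of DNA. This is not the decision tree mentioned above, only one plausible reading of it. The probe and subject sequences are made-up toy data, and the sketch assumes all patterns have the same width, which the default PDict() preprocessing expects.

```r
library(Biostrings)

# Hypothetical toy data: three 4-mer probes and one short subject.
probes  <- DNAStringSet(c(p1 = "ACGT", p2 = "TTTT", p3 = "GTAC"))
subject <- DNAString("ACGTACGTAC")

# Case (a): one short pattern against one subject.
# matchPattern() returns the match locations; countPattern() only counts.
hits <- matchPattern(probes[["p1"]], subject)
start(hits)                          # 1 5

# Inexact matching is requested through max.mismatch (no indels here).
countPattern(probes[["p1"]], subject, max.mismatch = 1)

# Case (b): many same-width patterns against one subject.
# Preprocess the patterns once with PDict(), then reuse the dictionary.
pdict <- PDict(probes)
n_hits    <- countPDict(pdict, subject)   # number of matches per pattern
has_match <- whichPDict(pdict, subject)   # indices of patterns with a match
locations <- matchPDict(pdict, subject)   # all match locations
n_hits                               # 2 0 2
has_match                            # 1 3
startIndex(locations)                # start positions, one list element per pattern

# Case (c): one pattern against many subjects.
subjects <- DNAStringSet(c("ACGTAC", "GGGG"))
n_per_subject <- vcountPattern("ACGT", subjects)
n_per_subject                        # 1 0
```

The three return styles in question 6 correspond here to matchPDict() (locations), countPDict() (counts) and whichPDict() (which patterns match). Whether these are the right replacements for a given matchprobes() use still depends on the answers to the questions above.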