Question about using matchPattern in the Biostrings package
Xiao Shi ▴ 90
@xiao-shi-1184
Last seen 9.7 years ago
Hi everybody,

I have a question about using matchDNAPattern in the Biostrings package. I have a DNA string, e.g. a=DNAString("AGCTGACTCAGTGGCTTGCT"), and I want to find a pattern, e.g. p="AGCT", allowing one mismatch. So I use:

> matchDNAPattern("AGCT",a,mismatch=1)
Object of class BioString with
Pattern alphabet: -TGCANBDHKMRSVWY
Values:
[1] AGC
[2] AGCT
[3] GCTG
[4] GACT
[5] CAGT
[6] GGCT
[7] TGCT

But I only want the subsequences whose base order agrees with the search pattern. For example, AGCT is a perfect match, and GGCT, CGCT, and TGCT each have one mismatch. AGC, GCTG, and CAGT are also in the result list, but their bases are shifted relative to the search pattern, and I don't want them.

How can I achieve this?
rgentleman ★ 5.5k
@rgentleman-7725
Last seen 9.0 years ago
United States
Hi Xiao,

Right now the code does not do quite what you want: it allows insertions, deletions, and substitutions, whereas, if I understand correctly, you are only interested in substitutions. I will look into modifying it so that you can specify each of these parameters directly, that is, the number of insertions, deletions, and substitutions, rather than the current "global" setting. But that will take a few days and will surely miss the release.

Best wishes,
Robert

On May 13, 2005, at 9:47 PM, Xiao Shi wrote:
> [question as posted above]

Robert Gentleman
Head, Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
phone: (206) 667-7700
fax: (206) 667-1319
office: M2-B865
email: rgentlem@fhcrc.org
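Until such an option exists, a substitutions-only search can be sketched in base R by sliding the pattern along the subject and counting positionwise differences (a Hamming-distance scan). This is only an illustration of the distinction drawn above, not the Biostrings implementation: the helper name `match_substitutions` and its `max_mismatch` argument are hypothetical, and it works on plain character strings rather than DNAString objects.

```r
# Hypothetical helper (not part of Biostrings): report every window of the
# subject that differs from the pattern by at most `max_mismatch`
# substitutions. Insertions and deletions are never considered, because
# each window has exactly the same length as the pattern.
match_substitutions <- function(pattern, subject, max_mismatch = 1) {
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  n_windows <- length(s) - length(p) + 1
  if (n_windows < 1) {
    # Subject shorter than pattern: no window can match.
    return(data.frame(start = integer(0), match = character(0),
                      mismatches = integer(0), stringsAsFactors = FALSE))
  }
  starts <- seq_len(n_windows)
  # Number of positions at which each window disagrees with the pattern.
  mism <- vapply(starts,
                 function(i) sum(s[i:(i + length(p) - 1)] != p),
                 integer(1))
  keep <- mism <= max_mismatch
  data.frame(start = starts[keep],
             match = substring(subject, starts[keep],
                               starts[keep] + length(p) - 1),
             mismatches = mism[keep],
             stringsAsFactors = FALSE)
}

hits <- match_substitutions("AGCT", "AGCTGACTCAGTGGCTTGCT", max_mismatch = 1)
print(hits)
```

On the example sequence this keeps only AGCT (position 1), GGCT (position 13), and TGCT (position 17). Later Biostrings releases document `matchPattern()` with `max.mismatch` and `with.indels` arguments, which may cover the same need; whether that applies to the version discussed in this thread is not something the thread itself confirms.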