a problem with trimLRPatterns
wang peter ★ 2.0k
@wang-peter-4647
Last seen 8.4 years ago
I do not know why a mismatch rate of 0.1 trims the adapter correctly, while setting max.Rmismatch to an integer from 1 to 9 trims nothing. Thanks.

> subject = "GGGAGTAAGAAAGGCACTGAAGGCACTATCAATAGATCGGAAGAGCGGTT"
> Rpattern = "CTGTAGGCACCATCAATAGATCGGAAGAGCGGTTCAGAAGGAATGCCGAG"
> trimLRPatterns(Rpattern = Rpattern, subject = subject, max.Rmismatch=1,with.Lindels=TRUE)
[1] "GGGAGTAAGAAAGGCACTGAAGGCACTATCAATAGATCGGAAGAGCGGTT"
> trimLRPatterns(Rpattern = Rpattern, subject = subject, max.Rmismatch=0.1,with.Lindels=TRUE)
[1] "GGGAGTAAGAAAGGCA"
> trimLRPatterns(Rpattern = Rpattern, subject = subject, max.Rmismatch=2,with.Lindels=TRUE)
[1] "GGGAGTAAGAAAGGCACTGAAGGCACTATCAATAGATCGGAAGAGCGGTT"
> trimLRPatterns(Rpattern = Rpattern, subject = subject, max.Rmismatch=3,with.Lindels=TRUE)
[1] "GGGAGTAAGAAAGGCACTGAAGGCACTATCAATAGATCGGAAGAGCGGTT"

--
shan gao
Room 231 (Dr. Fei lab)
Boyce Thompson Institute
Cornell University
Tower Road, Ithaca, NY 14853-1801
Office phone: 1-607-254-1267 (day)
Official email: sg839 at cornell.edu
Facebook: http://www.facebook.com/profile.php?id=100001986532253
@harris-a-jaffee-3972
Last seen 8.3 years ago
United States
The help page could probably use some annotation to guide the reader, but the mismatch arguments are taken to fall into one of 3 cases:

    Either an integer vector of length 'nLp = nchar(Lpattern)' representing an absolute number of mismatches (or edit distance if 'with.Lindels' is 'TRUE') ... or a single numeric value in the interval '[0, 1)' ... Otherwise, 'max.Lmismatch' is treated as an integer vector where negative numbers are used to prevent trimming at the 'i'-th location. When an input integer vector is shorter than 'nLp', it is augmented with enough '-1's at the beginning to bring its length up to 'nLp'. Elements of 'max.Lmismatch' beyond the first 'nLp' are ignored.

You are using cases 2 and 3 in what you are trying here.

A single numeric value (e.g. 0.1) gets expanded to

    as.integer(mismatch * 1:nchar(pattern))

An integer (1, 2, ... 9) [or a vector of length < nchar(pattern)] is augmented to a vector of length nchar(pattern) by padding it with -1 at the beginning, which prevents matching and trimming at all of those positions.

Therefore, you cannot get any trimming by setting the mismatch value to a single integer, say M, unless the whole pattern lies (at whichever end) within an edit distance of M from the subject. No partial patterns are even tested.

Instead of a single integer M, you might try

    rep(M, nchar(pattern))

Say, start with M=9, which will give some trimming, and lower M until you get no trimming.

I think it is better to use a rate than an integer vector, unless you want to refine what a rate expands to (above).
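To see what the different forms of max.Rmismatch turn into, here is a small sketch in base R that follows the expansion described in this reply. It illustrates that description and is not a copy of the Biostrings internals; the variable names (nRp, rate_expanded, single_expanded, full_vector) and the choice M = 3 are my own. The last call only runs if Biostrings is installed, and it leaves the indel arguments at their defaults.

```r
subject  <- "GGGAGTAAGAAAGGCACTGAAGGCACTATCAATAGATCGGAAGAGCGGTT"
Rpattern <- "CTGTAGGCACCATCAATAGATCGGAAGAGCGGTTCAGAAGGAATGCCGAG"
nRp <- nchar(Rpattern)  # pattern length, 50 here

# Case 2: a single rate in [0, 1) becomes one mismatch allowance per
# partial-pattern length, growing with that length.
rate_expanded <- as.integer(0.1 * 1:nRp)
print(rate_expanded)  # the first 9 entries are 0: short overlaps must match exactly

# Case 3: a single integer M is padded with -1 at the beginning, so only
# the full-length pattern is ever tested.
M <- 3L
single_expanded <- c(rep(-1L, nRp - 1L), M)
print(single_expanded)

# Suggested alternative: allow up to M mismatches at every length.
full_vector <- rep(M, nRp)
print(full_vector)

# Optional: try the full-length vector with trimLRPatterns itself.
if (requireNamespace("Biostrings", quietly = TRUE)) {
  print(Biostrings::trimLRPatterns(Rpattern = Rpattern, subject = subject,
                                   max.Rmismatch = full_vector))
}
```

Comparing single_expanded with rate_expanded shows why max.Rmismatch=1 through 9 trims nothing here: every position except the last is -1, so a partial overlap of the adapter with the end of the read is never considered.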
On Mar 9, 2012, at 2:15 PM, wang peter wrote:
> I do not know why 0.1 mismatch can work to trim the correct adapter, but
> if I set the max.Rmismatch from 1 to 9, it cannot work