"Wobble" patterns for genome searching using Biostrings
fongchunchan ▴ 30


I've been going through the "Efficient genome searching with Biostrings and the BSgenome data packages" pdf document to get a grasp on how to search for a particular motif in a genome sequence.

Specifically, I am interested in looking for the RGYW motif, which has a "wobble" (I think this is the correct term) at positions 1, 3, and 4. So basically, the R position can be either A or G, the Y position can be either C or T, and the W position can be either A or T.

For specifying the pattern, it seems that it has to be an explicit pattern (e.g. AGCA) for a Biostrings object. Is there any way to specify the pattern such that a given position could have multiple values? Something like [A/G]G[C/T][A/T]. The other solution would be to explicitly list all of the patterns (AGCA, GGCA, etc.) and search for each one. But if there is a way to do this "wobble" pattern, it would save some time, especially for a long pattern.

Thanks in advance,


biostrings bsgenome motif RGYW

In the help page from ?Biostrings::matchPattern, under the description for the fixed parameter, I see:

If TRUE (the default), an IUPAC ambiguity code in the pattern can only match the same code in the subject, and vice versa. If FALSE, an IUPAC ambiguity code in the pattern can match any letter in the subject that is associated with the code, and vice versa. See ?`lowlevel-matching` for more information.

This should get you what you're after, no?
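
For illustration, here is a minimal sketch of that approach on a made-up toy sequence (the sequence and the variable names are assumptions for the example, not taken from the question). It passes the IUPAC-coded motif straight to matchPattern with fixed=FALSE. One caveat: fixed=FALSE also lets ambiguity codes in the subject (such as N) act as wildcards. As I read the lowlevel-matching help page, fixed="subject" keeps the subject literal and only interprets the ambiguity codes in the pattern, which is probably what you want on a real genome.

```r
library(Biostrings)

# Toy subject sequence, invented for this example.
subject <- DNAString("AGCATTGGTACCGGCT")

# IUPAC codes: R = A/G, Y = C/T, W = A/T
motif <- DNAString("RGYW")

# fixed = FALSE: ambiguity codes in the pattern (and in the subject)
# match any of the letters they stand for.
hits <- matchPattern(motif, subject, fixed = FALSE)
hits          # the matching views
start(hits)   # start positions of the matches

# fixed = "subject": only the pattern's ambiguity codes act as wildcards;
# an N in the subject would then match only a literal N in the pattern.
hits_strict <- matchPattern(motif, subject, fixed = "subject")
start(hits_strict)
```

The same call should work unchanged on a chromosome taken from a BSgenome data package in place of the toy subject, assuming the relevant genome package is installed.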


Yes this is exactly what I need. Thanks. 

