Hi,
I've been going through the "Efficient genome searching with Biostrings and the BSgenome data packages" pdf document to get a grasp on how to search for a particular motif in a genome sequence.
Specifically, I am interested in looking for the RGYW motif, which has a "wobble" (I think this is the correct term) at positions 1, 3, and 4. The R position can be either A or G, the Y position can be either C or T, and the W position can be either A or T.
For specifying the pattern, it seems that it has to be an explicit pattern (e.g. AGCA) for a Biostrings object. Is there any way to specify the pattern so that a given position can take multiple values? Something like (A/G)G(C/T)(A/T). The other solution would be to explicitly list out all of the patterns (AGCA, GGCA, etc.) and search for each one. But if there is a way to specify this "wobble" pattern directly, it would save some time, especially for a long pattern.
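
To make the "list out all of the patterns" fallback concrete, here is a minimal sketch in plain Python. It is not Biostrings code. The names IUPAC and expand_motif are made up for illustration, and the table only covers the letters that RGYW needs.

```python
from itertools import product

# Subset of the IUPAC nucleotide ambiguity codes, just enough for RGYW.
# (Hypothetical helper table; extend it for other motifs.)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # A or G
    "Y": "CT",  # C or T
    "W": "AT",  # A or T
}

def expand_motif(motif):
    """Return every explicit sequence that an ambiguous motif stands for."""
    # One string of allowed bases per motif position.
    choices = [IUPAC[base] for base in motif]
    # Cartesian product over positions gives all explicit patterns.
    return ["".join(combo) for combo in product(*choices)]

patterns = expand_motif("RGYW")
print(len(patterns))  # 2 * 1 * 2 * 2 = 8 explicit patterns
print(patterns)
```

For RGYW this gives 8 explicit patterns, but the count multiplies with every ambiguous position, which is why a way to state the ambiguity in the pattern itself would be preferable for long motifs.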
Thanks in advance,
Fong
Yes, this is exactly what I need. Thanks.