Question: matchPWM on DNAStringSet rather than just one sequence?
Jake (United States) wrote, 2.2 years ago:

I have a couple of PWMs for RNA binding proteins, and I also have the sequences of candidate UTRs, in different groups, as a DNAStringSet. I'd like to see how many of the UTRs (and which ones) in each group match a given PWM. However, it looks like the matchPWM function in Biostrings only supports a single sequence rather than a DNAStringSet. Is there a way to do this besides (a) sticking all of my sequences together, matching, breaking them apart, or (b) looping through each sequence?

Thanks

Mike Smith (EMBL Heidelberg / de.NBI) wrote, 2.2 years ago:

I'm pretty sure you can use sapply for this, something along the lines of:  

sapply(my_DNAStringSet, FUN = matchPWM, pwm = my_PWM)
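
If only the number of hits per sequence is needed, the same loop can be written with countPWM instead of matchPWM. The following is a minimal sketch rather than a tested recipe for real data: the sequences, the PWM, and the names utrs, pwm, hits_per_utr and matched are made up for illustration, and the "90%" threshold is an arbitrary choice.

```r
library(Biostrings)

# Hypothetical toy inputs, for illustration only.
utrs <- DNAStringSet(c(utr1 = "AATGCATG", utr2 = "CATTTT", utr3 = "TGCATGCA"))

# Toy PWM whose best-scoring word is "TGCA". Rows must be named A, C, G, T;
# matrix() fills column by column, so each line below is one motif position.
pwm <- matrix(c(0, 0, 0, 1,   # position 1: T
                0, 0, 1, 0,   # position 2: G
                0, 1, 0, 0,   # position 3: C
                1, 0, 0, 0),  # position 4: A
              nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))

# Loop over the set, counting PWM hits in each sequence.
hits_per_utr <- vapply(
  seq_along(utrs),
  function(i) countPWM(pwm, utrs[[i]], min.score = "90%"),
  numeric(1)
)
names(hits_per_utr) <- names(utrs)

# Names of the sequences with at least one hit.
matched <- names(utrs)[hits_per_utr > 0]
```

Here hits_per_utr answers "how many hits in each UTR" and matched answers "which ones". Both matchPWM and countPWM scan only the strand given, which is usually what is wanted for UTRs stored in transcript orientation.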

Jake (United States) wrote, 2.2 years ago:

That works, but it is incredibly slow once I start looping through all of my UTRs and even a few RNA binding proteins. Is there another Bioconductor package, or a program outside Bioconductor, that would be significantly faster?


Hi Jake,

Have you tried the "sticking all of my sequences together, matching, breaking them apart" approach? It should be significantly faster than looping.

H.

(reply written 2.2 years ago by Hervé Pagès)
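
For reference, one way the "stick together, match, break apart" approach might look is sketched below. This is an illustration under assumptions, not code from anyone in this thread: the toy sequences and PWM, the variable names, and the "90%" threshold are all made up. The idea is to call matchPWM once on the concatenated sequence, then use the known sequence boundaries to assign each hit back to its UTR and to discard hits that straddle a junction between two UTRs.

```r
library(Biostrings)

# Hypothetical toy inputs, for illustration only.
utrs <- DNAStringSet(c(utr1 = "AATGCATG", utr2 = "CATTTT", utr3 = "TGCATGCA"))

# Toy PWM whose best-scoring word is "TGCA" (rows must be named A, C, G, T;
# each line below is one motif position).
pwm <- matrix(c(0, 0, 0, 1,   # position 1: T
                0, 0, 1, 0,   # position 2: G
                0, 1, 0, 0,   # position 3: C
                1, 0, 0, 0),  # position 4: A
              nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))

# 1. Stick everything together: unlist() on a DNAStringSet gives one DNAString.
concatenated <- unlist(utrs)

# 2. Match once against the long sequence.
hits <- matchPWM(pwm, concatenated, min.score = "90%")
hit_ranges <- IRanges(start = start(hits), end = end(hits))

# 3. Break apart: the coordinates each UTR occupies in the concatenated sequence.
bounds <- IRanges(end = cumsum(width(utrs)), width = width(utrs))

# Keep only hits lying entirely within one UTR; this drops spurious hits
# that span the junction between two neighbouring sequences.
ov <- findOverlaps(hit_ranges, bounds, type = "within")

# Number of hits per UTR, and the names of the UTRs with at least one hit.
counts <- tabulate(subjectHits(ov), nbins = length(utrs))
names(counts) <- names(utrs)
matched <- names(utrs)[counts > 0]
```

In this toy example the concatenated sequence contains one hit that spans the utr1/utr2 junction, and the type = "within" filter removes it. To recover positions relative to each UTR, subtract start(bounds)[subjectHits(ov)] - 1 from the start of each retained hit.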