Hi, I am trying to discover a desired motif in a set of 251 sequences, but my results are not consistent: in some runs I get the desired motif, and in other runs it disappears. I am now trying to find the motif by supplying a motif as a seed, with my sequences stored in a DNAStringSet object.
My seed motif, stored in the file motif.txt, is:
A 0.4619 0.927 0.8053 0.9305 0.4262 0.6623 0.4405 0.8018 0.7588 0.8912 0.7517 0.8268 0.4834 0.6158
C 0.0148 0.0077 0.0291 0.0148 0.3046 0.0112 0.022 0.0327 0.0148 0.022 0.0148 0.0184 0.0828 0.0291
G 0.1114 0.0148 0.0077 0.0291 0.0935 0.1221 0.0184 0.1078 0.0685 0.0148 0.1543 0.0291 0.1114 0.14
T 0.4119 0.0506 0.1579 0.0255 0.1758 0.2044 0.5192 0.0577 0.1579 0.072 0.0792 0.1257 0.3224 0.2151
Please tell me a possible command to find motifs similar to the seed motif in the DNAStringSet.
The `GADEM()` function has a `seed` argument and, according to its man page, "when a seed is specified, the run results are deterministic". This is a good feature that all randomized algorithms in Bioconductor are expected to have in order to allow reproducible research. Are you sure the non-deterministic behavior observed by the OP is not a bug?
There are 2 types of seeds with the `GADEM()` function: the `seed` argument you mention, which makes the results deterministic, and the `Spwm` param, which lets the user supply a motif as a starting point for the genetic algorithm (the other option is to let the `GADEM()` function generate the starting motifs from the most frequent k-mers in the sequences). Based on the initial question, I assumed Vinod was talking about the `Spwm` param. If that's not the case, then it's clearly a bug, as you said.
Yes, Vinod is saying that he's using a seeded motif (and is showing the motif). Are you saying that when the user gives the algorithm a seeded motif, the algorithm is not deterministic anymore? Just to clarify, deterministic means that 2 runs with exactly the same input (in particular the same `Spwm` args) will produce the same output.
What I meant is that if the user *only* provides a seeded motif through the `Spwm` arg, then it's not deterministic (a seeded motif but no seed). The `seed` arg should determine whether the algorithm is deterministic, independently of the values of any other args.
In the case of the OP, I assumed the `Spwm` arg was used without the `seed` arg since the results were different after each run. But I could be wrong, and in that case it would be a bug like you mention in your first comment.
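To make the two kinds of seed concrete, here is a rough sketch of a call that supplies both the starting motif and the random seed. It assumes that `Spwm` accepts a list of 4-row numeric matrices (rows A, C, G, T), which is my reading of the rGADEM documentation and should be checked against `?GADEM`. The helper name `run_seeded_gadem` and the variable `sequences` (the DNAStringSet of 251 sequences) are hypothetical.

```r
# Seed motif copied from the question, in the same layout as motif.txt
# (one row per base, the base letter as first field).
motif_txt <- "
A 0.4619 0.927 0.8053 0.9305 0.4262 0.6623 0.4405 0.8018 0.7588 0.8912 0.7517 0.8268 0.4834 0.6158
C 0.0148 0.0077 0.0291 0.0148 0.3046 0.0112 0.022 0.0327 0.0148 0.022 0.0148 0.0184 0.0828 0.0291
G 0.1114 0.0148 0.0077 0.0291 0.0935 0.1221 0.0184 0.1078 0.0685 0.0148 0.1543 0.0291 0.1114 0.14
T 0.4119 0.0506 0.1579 0.0255 0.1758 0.2044 0.5192 0.0577 0.1579 0.072 0.0792 0.1257 0.3224 0.2151
"

# Parse into a 4 x 14 numeric matrix with row names A, C, G, T.
# For the real file, read.table("motif.txt", row.names = 1) would be the analogue.
pwm <- as.matrix(read.table(text = motif_txt, row.names = 1))
colnames(pwm) <- NULL

# Hypothetical wrapper: passes BOTH the starting motif (Spwm) and the RNG seed.
# ASSUMPTION: Spwm takes a list of PWM matrices; verify with ?GADEM.
run_seeded_gadem <- function(sequences, pwm, seed = 1) {
  rGADEM::GADEM(sequences, seed = seed, Spwm = list(pwm))
}

# Usage, given a DNAStringSet called `sequences`:
# res1 <- run_seeded_gadem(sequences, pwm, seed = 1)
# res2 <- run_seeded_gadem(sequences, pwm, seed = 1)
```

If the `seed` argument behaves as the man page describes, `res1` and `res2` should contain the same motifs. If they differ, that would point to the bug discussed above.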
The `seed` argument has a default value of 1, which means that if the user doesn't supply it, it will be set to 1. Are you saying that when `seed=1` the algorithm is not deterministic? Is the value 1 treated in a special way? When I look at the implementation of the `GADEM()` function, it doesn't seem so: what I see is that the call to `set.seed(seed)` is skipped only if `seed` is set to NULL. So it looks like the algorithm is not deterministic only when the user supplies `seed=NULL`. As a consequence, if the user *only* provides a seeded motif through the `Spwm` arg (i.e. a seeded motif but no seed), then the algorithm should be deterministic. Am I missing something?
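The logic described above can be illustrated with a small base R sketch. This is not the rGADEM source: `toy_search` is a hypothetical stand-in that only mimics the pattern "call `set.seed(seed)` unless `seed` is NULL", with `runif()` playing the role of the randomized search.

```r
# Hypothetical stand-in for a randomized algorithm with a `seed` argument
# that defaults to 1 and is only ignored when explicitly set to NULL.
toy_search <- function(seed = 1) {
  if (!is.null(seed)) {
    set.seed(seed)  # fixes the RNG state, so the draws below are reproducible
  }
  runif(5)          # placeholder for the random part of the search
}

# Default seed (1): two runs give exactly the same result.
same_default <- identical(toy_search(), toy_search())

# Explicit identical seeds: also reproducible.
same_explicit <- identical(toy_search(seed = 42), toy_search(seed = 42))

# seed = NULL: set.seed() is skipped, so the runs will typically differ.
same_null <- identical(toy_search(seed = NULL), toy_search(seed = NULL))
```

With this pattern, `same_default` and `same_explicit` are both `TRUE`, and only the `seed = NULL` case is left to the current state of the random number generator. Whether `GADEM()` itself is fully reproducible also depends on whether all of its randomness goes through R's RNG, which this sketch does not test.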