Hi Oliver,
Thanks for the quick reply.
I ran an analysis on 10 probe sequences of varying lengths, but I get
a very large E-value. Below is my cosmo result without a background
model:
Estimated position weight matrix:
       1      2 3      4      5 6      7      8
A 0.1932 0.5002 0 0.3115 0.0288 0 0.5498 0.7435
C 0.0000 0.0000 1 0.0120 0.0000 0 0.0000 0.1056
G 0.0556 0.0000 0 0.0017 0.2582 1 0.0000 0.0000
T 0.7512 0.4998 0 0.6748 0.7130 0 0.4502 0.1509
Motif occurrences:
E-value: 913225239
   seq       pos orient motif    prob
1  MAPK1     171      1 TTCTTGTC 1.0000000
2  STARD5    533      1 TTCATGTC 0.9995751
3  NDFIP2    971      1 AACTTGTA 0.9957829
4  MGC23280  435     -1 TTCATGTA 0.9771451
5  POLR2J2  1209     -1 TTCCAGTA 0.9092027
6  MGC16291  519     -1 TTCAAGAA 0.9060919
7  ICAM5     651      1 TTCTTGAT 0.8572779
8  C6orf64    86      1 AACTGGTA 0.8383198
9  RAB2B     433      1 TACTTGAA 0.6048461
10 HIVEP3     89      1 TTCATGAA 0.3739610
cvOrder: Order of background Markov model estimated as order = 2 by CV
eGetStart: Extracting starting values from sequence 10/10
fit: mType = OOPS conSet = 0 width = 8 nSitesNum = 1/1 starting value = 5/5
finalModel: fitting model for width 8 modType OOPS and conSet 0
finalModel: startNum 3 and nSitesNum 0
fit: mType = OOPS conSet = 0 width = 8 nSitesNum = 1/1 starting value = 4/5
Could you please tell me what the reason might be? Is it because of
the different probe lengths, or something else?
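
As a rough sanity check on the matrix itself, independent of the E-value,
the per-position information content and the consensus can be computed
directly from the numbers above. This is only a sketch in plain Python: it
does not call cosmo and does not reproduce how cosmo computes its E-value.
The function names and the uniform 0.25 background are assumptions made for
illustration.

```python
import math

# Position weight matrix copied from the cosmo output above
# (one list per base, one entry per motif position 1-8).
PWM = {
    "A": [0.1932, 0.5002, 0.0, 0.3115, 0.0288, 0.0, 0.5498, 0.7435],
    "C": [0.0000, 0.0000, 1.0, 0.0120, 0.0000, 0.0, 0.0000, 0.1056],
    "G": [0.0556, 0.0000, 0.0, 0.0017, 0.2582, 1.0, 0.0000, 0.0000],
    "T": [0.7512, 0.4998, 0.0, 0.6748, 0.7130, 0.0, 0.4502, 0.1509],
}


def column_information(pwm, uniform_bg=0.25):
    """Relative entropy (in bits) of each column against a uniform background.

    The uniform background is an assumption; a fitted background model
    would give different values.
    """
    width = len(pwm["A"])
    info = []
    for j in range(width):
        bits = 0.0
        for base in "ACGT":
            p = pwm[base][j]
            if p > 0:  # treat 0 * log(0) as 0
                bits += p * math.log2(p / uniform_bg)
        info.append(bits)
    return info


def consensus(pwm):
    """Most probable base at each motif position."""
    width = len(pwm["A"])
    return "".join(max("ACGT", key=lambda b: pwm[b][j]) for j in range(width))


if __name__ == "__main__":
    for pos, bits in enumerate(column_information(PWM), start=1):
        print(f"position {pos}: {bits:.3f} bits")
    print("consensus:", consensus(PWM))
```

Positions with values close to 2 bits are fully conserved, while values
near 0 mean the column is hardly distinguishable from the assumed
background.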
Regards,
Prashantha Hebbar Kiradi,
Dept. of Biotechnology,
Manipal Life Sciences Center,
Manipal University,
Manipal, India
Email:prashantha.hebbar@manipal.edu
--- On Sun, 3/7/10, Oliver Bembom <bembom@gmail.com> wrote:
From: Oliver Bembom <bembom@gmail.com>
Subject: Re: Cosmo Background model
To: "Prashantha Hebbar" <prashantha.hebbar@yahoo.com>
Date: Sunday, March 7, 2010, 4:42 AM
Hi Prashantha,
The background sequences would ideally come from sequences that are
similar to those where the motif in question is to be found, but that
do not contain the motif itself. So maybe similar promoter sequences
that you know don't contain your motif. By default, I believe cosmo
will just use the sequences you supply to search through.
Hope this helps,
Oliver
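
To make the idea of a background model concrete, here is a minimal sketch
of estimating a fixed-order Markov background from a set of motif-free
sequences. It is plain Python for illustration only, not cosmo's own
estimation procedure; the function name, the pseudocount, and the toy
sequences are all assumptions.

```python
from collections import defaultdict


def markov_background(sequences, order=2, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) from background sequences.

    Returns a dict mapping each observed context (a string of length
    `order`) to a dict of probabilities over A, C, G, T.
    """
    counts = defaultdict(lambda: dict.fromkeys("ACGT", 0.0))
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # Skip windows containing N or other ambiguity codes.
            if set(context + base) <= set("ACGT"):
                counts[context][base] += 1

    model = {}
    for context, base_counts in counts.items():
        total = sum(base_counts.values()) + 4 * pseudocount
        model[context] = {
            b: (base_counts[b] + pseudocount) / total for b in "ACGT"
        }
    return model


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for motif-free promoters.
    background = ["ACGTACGTTAGCTAGCTA", "TTGACCGATAGGCTAACG"]
    model = markov_background(background, order=2)
    for context in sorted(model):
        print(context, model[context])
```

In practice the background set would need to be much larger than this toy
input, since an order-2 model already has 16 contexts with 4 probabilities
each.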
On Mar 6, 2010, at 2:11, Prashantha Hebbar
<prashantha.hebbar@yahoo.com> wrote:
Hi Oliver,
I am interested in finding the consensus sequence of transcription
factor binding sites in a list of probe sequences. I found that cosmo
would be the best choice. But while going through the cosmo vignette, I
did not understand the criteria for choosing sequences to estimate the
background model. That is, if the probe sequences are from promoter
regions, should the sequences selected to estimate the background
model also be promoter sequences of the same probes?
Regards,
Prashantha
Prashantha Hebbar Kiradi,
Manipal Life Sciences Center,
Manipal University,
Email:prashantha.hebbar at manipal.edu