Question

Cosmo Background model


Prashantha Hebbar ▴ 260

@prashantha-hebbar-3526


Hi Oliver, I am interested in finding the consensus sequence of transcription factor binding sites in a list of probe sequences. I found that cosmo would be the best choice. But while going through the cosmo vignette, I did not understand the criteria for selecting the sequences used to calculate the background model. That is, if the probe sequences come from promoter regions, should the sequences used to calculate the background model also be promoter sequences of the same probes?

Regards, Prashantha

Prashantha Hebbar Kiradi, Manipal Life Sciences Center, Manipal University, Email: prashantha.hebbar at manipal.edu

probe cosmo • 1.2k views

15.8 years ago Prashantha Hebbar ▴ 260

score 0 · Answer 1 · 2010-03-06

Hi Oliver, Thanks for the quick reply. I have done an analysis with 10 probe sequences of varying lengths, but I get a very large E-value. The following is my cosmo result without a background model:

Estimated position weight matrix:

           1      2 3      4      5 6      7      8
    A 0.1932 0.5002 0 0.3115 0.0288 0 0.5498 0.7435
    C 0.0000 0.0000 1 0.0120 0.0000 0 0.0000 0.1056
    G 0.0556 0.0000 0 0.0017 0.2582 1 0.0000 0.0000
    T 0.7512 0.4998 0 0.6748 0.7130 0 0.4502 0.1509

Motif occurrences:

E-value: 913225239

            seq  pos orient    motif      prob
    1     MAPK1  171      1 TTCTTGTC 1.0000000
    2    STARD5  533      1 TTCATGTC 0.9995751
    3    NDFIP2  971      1 AACTTGTA 0.9957829
    4  MGC23280  435     -1 TTCATGTA 0.9771451
    5   POLR2J2 1209     -1 TTCCAGTA 0.9092027
    6  MGC16291  519     -1 TTCAAGAA 0.9060919
    7     ICAM5  651      1 TTCTTGAT 0.8572779
    8   C6orf64   86      1 AACTGGTA 0.8383198
    9     RAB2B  433      1 TACTTGAA 0.6048461
    10   HIVEP3   89      1 TTCATGAA 0.3739610

    cvOrder: Order of background Markov model estimated as order = 2 by CV
    eGetStart: Extracting starting values from sequence 10/10
    fit: mType = OOPS conSet = 0 width = 8 nSitesNum = 1/1 starting value = 5/5
    finalModel: fitting model for width 8 modType OOPS and conSet 0
    finalModel: startNum 3 and nSitesNum 0
    fit: mType = OOPS conSet = 0 width = 8 nSitesNum = 1/1 starting value = 4/5

Can you please tell me what could be the reason? Is it because of the different lengths of the probes, or something else?

Regards, Prashantha Hebbar Kiradi, Dept. of Biotechnology, Manipal Life Sciences Center, Manipal University, Manipal, India Email: prashantha.hebbar@manipal.edu

--- On Sun, 3/7/10, Oliver Bembom <bembom@gmail.com> wrote:

From: Oliver Bembom <bembom@gmail.com>
Subject: Re: Cosmo Background model
To: "Prashantha Hebbar" <prashantha.hebbar@yahoo.com>
Date: Sunday, March 7, 2010, 4:42 AM

Hi Prashanta, The background sequences would ideally be similar to the sequences where the motif in question is to be found, but would not contain the motif itself. So maybe similar promoter sequences that you know don't contain your motif. By default, I believe cosmo will just use the sequences you supply to search through. Hope this helps, Oliver
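Since the original goal was a consensus sequence, here is a minimal Python sketch of reading one off the estimated position weight matrix quoted above. It is not part of cosmo; the `consensus` helper is a hypothetical name, and it simply takes the most probable base in each column, which is only one of several ways to summarise a matrix.

```python
# Position weight matrix copied from the cosmo output quoted in this thread.
# Keys are nucleotides; each list holds the probabilities for positions 1..8.
PWM = {
    "A": [0.1932, 0.5002, 0.0, 0.3115, 0.0288, 0.0, 0.5498, 0.7435],
    "C": [0.0000, 0.0000, 1.0, 0.0120, 0.0000, 0.0, 0.0000, 0.1056],
    "G": [0.0556, 0.0000, 0.0, 0.0017, 0.2582, 1.0, 0.0000, 0.0000],
    "T": [0.7512, 0.4998, 0.0, 0.6748, 0.7130, 0.0, 0.4502, 0.1509],
}


def consensus(pwm):
    """Return the most probable base at each position (simple argmax).

    Hypothetical helper for illustration; not a cosmo function.
    """
    width = len(next(iter(pwm.values())))
    return "".join(
        max(pwm, key=lambda base: pwm[base][pos]) for pos in range(width)
    )


print(consensus(PWM))  # TACTTGAA
```

Note that positions 2 and 7 are almost evenly split between A and T, so a single-letter consensus hides that ambiguity; the IUPAC code W (A or T) may be a fairer summary at those positions.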
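To make the idea of a background model concrete, the following Python sketch estimates a fixed-order Markov model from a set of background sequences. This is an illustration under assumptions, not cosmo's implementation: `markov_background`, the pseudocount, and the two example sequences are all hypothetical, and the order is fixed by hand rather than chosen by cross-validation as the cosmo log above reports.

```python
from collections import Counter, defaultdict


def markov_background(sequences, order=2, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) from background sequences.

    Hypothetical helper for illustration; not a cosmo function.
    """
    alphabet = "ACGT"
    counts = defaultdict(Counter)
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # Skip windows that contain ambiguous characters such as N.
            if base in alphabet and all(c in alphabet for c in context):
                counts[context][base] += 1

    model = {}
    for context, base_counts in counts.items():
        # The pseudocount keeps unseen transitions from getting probability 0.
        total = sum(base_counts.values()) + pseudocount * len(alphabet)
        model[context] = {
            b: (base_counts[b] + pseudocount) / total for b in alphabet
        }
    return model


# Made-up background sequences; in practice these would be sequences similar
# to the probes (e.g. promoters) that are believed not to contain the motif.
background = ["ACGTACGTTAGC", "TTGACCATGCAA"]
model = markov_background(background, order=2)
print(model["AC"])
```

With only a few short sequences, most contexts are seen rarely or never, so the estimates are dominated by the pseudocount; a larger background set gives more stable probabilities.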