Motif search -- access to JASPAR, MotIV package, more TF-PWM relationships?
Paul Shannon ▴ 750
@paul-shannon-5161
Last seen 9.6 years ago
(redirecting this back to the Bioc list...)

Hi Nooshin,

The 'bulk' approach is not quite so ready as I predicted. I might have something by the end of the week.

As for mapping between PWMs and TFs, I have most often done this with 'tom-tom' from the MEME website.

But I just discovered what looks like a good -- maybe better -- approach: the Bioconductor MotIV package, which includes a 2010 version of JASPAR. Try this:

    source("http://bioconductor.org/biocLite.R")
    biocLite("MotIV")
    library(MotIV)
    browseVignettes("MotIV")

The jaspar data in this package has 130 TF-PWM mappings, which appear to be human. More must be known, and publicly available. The JASPAR website has a 'JASPAR CORE Plantae' data set that

- is probably what you are interested in
- might be downloadable, and convertible to the form MotIV wants.

Perhaps other readers of the list have other suggestions.

If you have any questions on this, please include 'BioC' in your reply, so that we can all get better at this!

- Paul

On Apr 23, 2012, at 6:53 AM, nooshin wrote:

> Hi Paul,
>
> Many thanks for your comprehensive information and code!
> I have a question about extracting PWMs. How and where can I download these matrices for all TFs for which a PWM is available? I need them only for Arabidopsis thaliana.
> Is there any R package to which I can give a TF and receive its PWM? Or any online database from which I can download them? I have been struggling since Friday to find these matrices for different TFs of A.th. It would be great if you could help me get them.
>
>> If you want to do this in bulk, Herve' has some lovely code to make that efficient.
> Also can I have this? :)
>
> Thanks a lot in advance.
> Best regards,
> Nooshin
Arabidopsis thaliana MotIV • 3.7k views
nooshin ▴ 300
@nooshin-5239
Last seen 5.4 years ago
Hi Paul,

Thanks a lot.
I forgot to include bioc, since I only replied to you (not to all).

I can't install the MotIV package to check. I searched on Google but couldn't find any solution! Do you have any suggestions for installing this package?

Bests,
Nooshin

On 04/23/2012 06:35 PM, Paul Shannon wrote:
> (redirecting this back to the Bioc list...)
>
> Hi Nooshin,
>
> The 'bulk' approach is not quite so ready as I predicted. I might have something by the end of the week.
>
> As for mapping between PWMs and TFs, I have most often done this with 'tom-tom' from the MEME website.
>
> But I just discovered what looks like a good -- maybe better -- approach: the Bioconductor MotIV package, which includes a 2010 version of JASPAR.
> Try this:
>
>     source("http://bioconductor.org/biocLite.R")
>     biocLite("MotIV")
>     library(MotIV)
>     browseVignettes("MotIV")
>
> The jaspar data in this package has 130 TF-PWM mappings, which appear to be human. More must be known, and publicly available. The JASPAR website has a 'JASPAR CORE Plantae' data set that
> - is probably what you are interested in
> - might be downloadable, and convertible to the form MotIV wants.
>
> Perhaps other readers of the list have other suggestions.
>
> If you have any questions on this, please include 'BioC' in your reply, so that we can all get better at this!
>
> - Paul
source("http://bioconductor.org/biocLite.R")
biocLite("MotIV")

ought to do the trick for you

On Tue, Apr 24, 2012 at 1:01 AM, nooshin <n_omranian@yahoo.com> wrote:
> Hi Paul,
>
> Thanks a lot.
> I forgot to include bioc, since I only replied to you (not to all).
>
> I can't install the MotIV package to check. I searched on Google but couldn't find any solution! Do you have any suggestions for installing this package?
>
> Bests,
> Nooshin
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
A model is a lie that helps you see the truth.
Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
Thanks, it's already solved. It needs the GSL package, which is a bit problematic, but I have sorted that out.

But it includes only 5 matrices for Arabidopsis, both on the web page and in the package!
I'm downloading them manually from AthaMap!

Thanks again; I'll keep waiting for the 'bulk' approach.

Bests,
Nooshin

On 04/24/2012 03:16 PM, Tim Triche, Jr. wrote:
> source("http://bioconductor.org/biocLite.R")
> biocLite("MotIV")
>
> ought to do the trick for you
Ah, I see. GSL is a useful library to have installed regardless. Hope things work out. I found your exchanges with Paul to be useful reading, but obviously I was not reading closely enough, since Paul started off his code sample with biocLite('MotIV'). Oops :-o

Here is a paper that I found interesting, which goes into some detail towards a "bulk" approach, from Gottardo's group:

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016432

Perhaps it will be useful to you as well; I would be curious to hear if so.

--t

On Tue, Apr 24, 2012 at 7:00 AM, nooshin <n_omranian@yahoo.com> wrote:
> Thanks, it's already solved. It needs the GSL package, which is a bit problematic, but I have sorted that out.
>
> But it includes only 5 matrices for Arabidopsis, both on the web page and in the package!
> I'm downloading them manually from AthaMap!
>
> Thanks again; I'll keep waiting for the 'bulk' approach.
>
> Bests,
> Nooshin
Thanks a lot for your suggestion. I will certainly have a look and let you know.

Bests,
Nooshin

On 04/24/2012 04:15 PM, Tim Triche, Jr. wrote:
> Ah, I see. GSL is a useful library to have installed regardless. Hope things work out. I found your exchanges with Paul to be useful reading, but obviously I was not reading closely enough, since Paul started off his code sample with biocLite('MotIV'). Oops :-o
>
> Here is a paper that I found interesting, which goes into some detail towards a "bulk" approach, from Gottardo's group:
>
> http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016432
>
> Perhaps it will be useful to you as well; I would be curious to hear if so.
>
> --t
Hello,

I am one of the developers of MotIV. I will be happy to help if you have any questions regarding the package.

First, I want to mention that in the PLoS ONE paper we used PICS, rGADEM and MotIV as a pipeline, but MotIV can be used standalone. Some of the advanced functions won't be available, though.

Since the PWMs in MotIV correspond to human TFs, you may have to use your own list of PWMs. What MotIV needs is a simple list of matrices (run head(jaspar) to view the format).
JASPAR's PWMs can be easily downloaded, but it seems it only contains ~20 motifs. On the other hand, AthaMap has more motifs, but I did not manage to find an easy way to get them. Another place to look is the AGRIS website (http://arabidopsis.med.ohio-state.edu/downloads.html).

If you are only interested in identifying the motifs and do not want to do further analysis in R, I recommend looking at http://www.benoslab.pitt.edu/stamp.

Regards,

Eloi Mercier

On 12-04-24 07:36 AM, nooshin wrote:
> Thanks a lot for your suggestion. I will certainly have a look and let you know.
>
> Bests,
> Nooshin

--
Eloi Mercier
Bioinformatics PhD Student, UBC
Paul Pavlidis Lab
2185 East Mall
University of British Columbia
Vancouver BC V6T1Z4
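For anyone who wants to try the "own list of PWMs" route, here is a minimal base-R sketch of turning JASPAR-style count matrices into a named list of matrices. It is only an illustration: the helper name parse_jaspar_text, the toy motif "MA0000.0 toyTF" and its counts are made up, and the layout (4 rows A/C/G/T, one column per position, each column summing to 1) is an assumption that should be checked against head(jaspar) before handing the list to MotIV.

```r
# Hypothetical helper (not part of MotIV): parse JASPAR-style text into a
# named list of 4 x width matrices with rows A, C, G, T.
parse_jaspar_text <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]            # drop empty lines
  header_idx <- grep("^>", lines)          # each motif starts with ">"
  motifs <- list()
  for (start in header_idx) {
    # Motif name: header text without the leading ">"
    name <- trimws(sub("^>", "", lines[start]))
    # The four lines after the header are assumed to be A, C, G, T counts
    rows <- lines[(start + 1):(start + 4)]
    rows <- sub("^[ACGT]", "", rows)       # optional leading base letter
    rows <- gsub("[", " ", rows, fixed = TRUE)
    rows <- gsub("]", " ", rows, fixed = TRUE)
    counts <- do.call(rbind, lapply(rows, function(r) {
      as.numeric(strsplit(trimws(r), "[[:space:]]+")[[1]])
    }))
    rownames(counts) <- c("A", "C", "G", "T")
    # Convert counts to per-position frequencies (each column sums to 1)
    motifs[[name]] <- sweep(counts, 2, colSums(counts), "/")
  }
  motifs
}

# Toy input, invented for illustration only
example_lines <- c(
  ">MA0000.0 toyTF",
  "A [ 8  0 2 ]",
  "C [ 0  0 2 ]",
  "G [ 2 10 4 ]",
  "T [ 0  0 2 ]"
)
pwm_list <- parse_jaspar_text(example_lines)
```

With a real download, readLines() on the file would supply the lines argument. The sketch does no error checking (ragged rows, truncated files), so it is a starting point rather than a finished importer.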
Eloi,

I would like to use MotIV for a C. elegans dataset. What data source would you recommend for motifMatch? Many thanks for your help!

Best regards,

Julie

On 4/24/12 1:28 PM, "Mercier Eloi" <emercier at chibi.ubc.ca> wrote:
> Hello,
>
> I am one of the developers of MotIV. I will be happy to help if you have any questions regarding the package.
>
> First, I want to mention that in the PLoS ONE paper we used PICS, rGADEM and MotIV as a pipeline, but MotIV can be used standalone. Some of the advanced functions won't be available, though.
>
> Since the PWMs in MotIV correspond to human TFs, you may have to use your own list of PWMs. What MotIV needs is a simple list of matrices (run head(jaspar) to view the format).
> JASPAR's PWMs can be easily downloaded, but it seems it only contains ~20 motifs. On the other hand, AthaMap has more motifs, but I did not manage to find an easy way to get them. Another place to look is the AGRIS website (http://arabidopsis.med.ohio-state.edu/downloads.html).
>
> If you are only interested in identifying the motifs and do not want to do further analysis in R, I recommend looking at http://www.benoslab.pitt.edu/stamp.
>
> Regards,
>
> Eloi Mercier
The recent flurry of interest in sequence motifs here on the bioc list suggests to us that maybe we at Bioconductor could strengthen our infrastructure for this kind of work. If this work interests you -- either as a package creator, or as a package user -- please suggest ideas or use cases. What do you need? I will collect and collate the responses. We hope to identify places where Bioc can help out.

For background: we already have a number of packages (rGADEM, MotIV, cosmo, BCRANK, motifRG) which address, with different strengths, what I believe to be the two aspects of the motif problem:

1) Detecting enriched motifs in DNA sequence, or in ChIP-seq data (rGADEM, cosmo, motifRG, BCRANK)
2) Predicting which known binding motifs these enriched motifs match, and what binding molecules they belong to (MotIV)

In the past, a lot of sequence motif/binding work has addressed the search for transcription factor binding sites and their cognate transcription factors. miRNAs, phosphorylation and methylation all pose related problems. Is there support which we can practically offer here as well?

In addition to Bioc packages, there are of course many worthwhile websites and external tools: JASPAR, MEME, STAMP (and TRANSFAC, for those with a license). Nooshin mentioned the Arabidopsis-specific 'AthaMap' (http://www.athamap.de). Are there other open-source data repositories like this for other organisms? C. elegans, as Julie requested?

Questions, suggestions, use cases and data sources are all welcome.

Thanks!

- Paul
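As a rough illustration of the second aspect (matching a discovered motif against a set of known ones), here is a minimal base-R sketch. It is not how MotIV, TOMTOM or STAMP score motifs: it assumes all motifs have the same width, does no alignment, and simply averages the column-wise Pearson correlation. The count matrices and the names TF_alpha and TF_beta are invented for the example.

```r
# Turn a 4 x width count matrix (rows A, C, G, T) into column-wise frequencies.
normalize_counts <- function(counts) {
  rownames(counts) <- c("A", "C", "G", "T")
  sweep(counts, 2, colSums(counts), "/")
}

# Hypothetical query motif, e.g. one reported by a de novo motif finder.
query <- normalize_counts(rbind(c(8, 0, 0, 1),
                                c(1, 9, 0, 0),
                                c(0, 1, 10, 0),
                                c(1, 0, 0, 9)))

# Hypothetical "database" of known motifs: a plain named list of matrices.
database <- list(
  TF_alpha = normalize_counts(rbind(c(9, 0, 0, 0),
                                    c(0, 10, 0, 1),
                                    c(1, 0, 9, 0),
                                    c(0, 0, 1, 9))),
  TF_beta  = normalize_counts(rbind(c(0, 1, 9, 0),
                                    c(0, 0, 1, 9),
                                    c(9, 0, 0, 1),
                                    c(1, 9, 0, 0)))
)

# Mean Pearson correlation between corresponding columns of two motifs.
# Note: a perfectly uniform column has zero variance and would give NA.
similarity <- function(m1, m2) {
  stopifnot(identical(dim(m1), dim(m2)))
  mean(vapply(seq_len(ncol(m1)),
              function(j) cor(m1[, j], m2[, j]),
              numeric(1)))
}

scores <- vapply(database, similarity, numeric(1), m2 = query)
best_match <- names(scores)[which.max(scores)]
print(scores)
print(best_match)
```

Real tools also try different offsets and the reverse complement, and attach a significance estimate to each match, which this toy version leaves out.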
Paul,

Thanks so much for the comprehensive summary of the existing capabilities of Bioc and other resources for motif discovery and matching!

Here is my response to your great initiative to collect use cases and open data resources.

Here is an open data source for Drosophila which we developed:
http://pgfe.umassmed.edu/TFDBS/
http://nar.oxfordjournals.org/content/early/2010/11/19/nar.gkq858.full

As you pointed out, there are several excellent Bioconductor packages available for the two common cases of motif problems, i.e., de novo motif discovery and motif matching to known motifs. It would be useful to have more motif databases available for motif comparison programs such as MotIV. In addition, we use clover to search for known motifs in a given set of sequences.

Many thanks for sharing your insights!

Best regards,

Julie
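On getting more motif databases into a form that a comparison program can use: earlier in this thread the target format for MotIV was described as a simple list of matrices. The base-R sketch below shows one way that conversion might look. It assumes a JASPAR-like text layout (a ">ID name" header followed by four rows of counts for A, C, G and T); the motif IDs, names and counts are made up, and whether MotIV accepts the result unchanged has not been checked here.

```r
# Two invented motifs in a JASPAR-like layout. In practice these lines
# would come from readLines() on a downloaded matrix file.
jaspar_text <- c(
  ">MX0001 toyTF1",
  "8 0 0 1",
  "1 9 0 0",
  "0 1 10 0",
  "1 0 0 9",
  ">MX0002 toyTF2",
  "0 1 9",
  "0 0 1",
  "9 0 0",
  "1 9 0"
)

# Parse header + 4 count rows per motif into a named list of
# column-normalized matrices with rows A, C, G, T.
parse_motifs <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]
  header_idx <- grep("^>", lines)
  motifs <- lapply(header_idx, function(i) {
    rows <- strsplit(lines[(i + 1):(i + 4)], "[[:space:]]+")
    counts <- do.call(rbind, lapply(rows, as.numeric))
    rownames(counts) <- c("A", "C", "G", "T")
    sweep(counts, 2, colSums(counts), "/")  # each column sums to 1
  })
  # Use the name after the ID in the header as the list name.
  names(motifs) <- sub("^>[^[:space:]]+[[:space:]]+", "", lines[header_idx])
  motifs
}

motif_list <- parse_motifs(jaspar_text)
str(motif_list)
```

Files that wrap the counts in brackets or prefix each row with the base letter would need an extra clean-up step before the numeric conversion.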
Hi Julie,

FlyFactorSurvey looks great. Would that we had such a resource (curated, current, and growing) for all organisms!

A few questions, if I may:

1) What role with respect to FlyFactorSurvey do you picture us taking here at BioC? How can we help?

2) Your website (http://pgfe.umassmed.edu/TFDBS) recommends MEME and TOMTOM for motif comparison. Do you use them yourself? If so, can you tell us about their strengths and weaknesses? How do they compare to clover? (http://zlab.bu.edu/clover/)

In that same spirit -- trying to find out more about this topic -- here are some more questions:

3) The JASPAR database seems to be mostly unchanged since 2009 (http://jaspar.genereg.net/html/DOWNLOAD). Does anyone know their update policy?

4) Is TRANSFAC only for license holders?

5) Are there any other organism-specific gems like FlyFactorSurvey to be discovered out on the web?

Thanks!

- Paul
Hi all,

I have followed all your emails. Many thanks for the useful information you shared.

> 3) The JASPAR database seems to be mostly unchanged since 2009.
> (http://jaspar.genereg.net/html/DOWNLOAD). Does anyone know their update policy?

No, I don't know. I just checked for Arabidopsis thaliana and could find only FIVE genes!

> 4) Is TRANSFAC only for license holders?

Yes, one has to pay for a license in order to use it.

> 5) Are there any other organism-specific gems like FlyFactorSurvey to be discovered out on the web?

This is MotifSampler, which I found earlier. I'm not sure whether it works like FlyFactorSurvey or not:
http://homes.esat.kuleuven.be/~sistawww/bioi/thijs/Work/MotifSampler.html

Many thanks, Paul, for trying to clarify all these points!

Bests,
Nooshin

On 04/25/2012 05:02 AM, Paul Shannon wrote:
> Hi Julie,
>
> FlyFactorSurvey looks great. Would that we had such a resource (curated, current, and growing) for all organisms!
>
> A few questions, if I may:
>
> 1) What role with respect to FlyFactorSurvey do you picture us taking here at BioC? How can we help?
>
> 2) Your website (http://pgfe.umassmed.edu/TFDBS) recommends meme and TOMTOM for motif comparison. Do you use them yourself? If so, can you tell us about their strengths and weaknesses? How do they compare to clover? (http://zlab.bu.edu/clover/)
>
> In that same spirit -- trying to find out more about this topic -- here are some more questions:
>
> 3) The JASPAR database seems to be mostly unchanged since 2009.
> (http://jaspar.genereg.net/html/DOWNLOAD). Does anyone know their update policy?
>
> 4) Is TRANSFAC only for license holders?
>
> 5) Are there any other organism-specific gems like FlyFactorSurvey to be discovered out on the web?
>
> Thanks!
>
> - Paul
>
> On Apr 24, 2012, at 3:16 PM, Zhu, Lihua (Julie) wrote:
>
>> Paul,
>>
>> Thanks so much for the comprehensive summary of existing capability of Bioc
>> and other resources for motif discovery and matching!
>>
>> Here is my response to your great initiative to collect use cases and open
>> data resources.
>>
>> Here is an open data source for Drosophila which we developed:
>> http://pgfe.umassmed.edu/TFDBS/
>> http://nar.oxfordjournals.org/content/early/2010/11/19/nar.gkq858.full
>>
>> As you pointed out, there are several excellent Bioconductor packages
>> available for the two common cases of motif problems, i.e., de novo motif
>> discovery and motif matching to known motifs. It would be useful to have
>> more motif databases available for motif comparison programs such as MotIV.
>> In addition, we use clover to search for known motifs in a given set of
>> sequences.
>>
>> Many thanks for sharing your insights!
>>
>> Best regards,
>>
>> Julie
>>
>> On 4/24/12 3:02 PM, "Paul Shannon" <pshannon at fhcrc.org> wrote:
>>
>>> The recent flurry of interest in sequence motifs here on the bioc list
>>> suggests to us that maybe we at Bioconductor could strengthen our
>>> infrastructure for this kind of work. If this work interests you -- either as
>>> a package creator, or as a package user -- please suggest ideas or use cases.
>>> What do you need? I will collect and collate the responses. We hope to
>>> identify places where Bioc can help out.
>>>
>>> For background: we already have a number of packages (rGADEM, MotIV, cosmo,
>>> BCRANK, motifRG) which address, with different strengths, what I believe to be
>>> the two aspects of the motif problem:
>>>
>>> 1) Detecting enriched motifs in DNA sequence, or in ChIP-seq data (rGADEM,
>>> cosmo, motifRG, BCRANK)
>>> 2) Predicting the sequence motifs which bind to these enriched motifs, and
>>> what binding molecules they belong to (MotIV)
>>>
>>> In the past, a lot of sequence motif/binding work has addressed the search for
>>> transcription factor binding sites and their cognate transcription factors.
>>> miRNAs, phosphorylation and methylation all pose related problems. Is there
>>> support which we can practically offer here as well?
>>>
>>> In addition to Bioc packages, there are of course many worthwhile websites and
>>> external tools: JASPAR, meme, STAMP (and TRANSFAC, for those with a license).
>>> Nooshin mentioned the Arabidopsis-specific 'AthaMap' (http://www.athamap.de).
>>> Are there other open-source data repositories like this for other organisms?
>>> C. elegans, as Julie requested?
>>>
>>> Questions, suggestions, use cases and data sources are all welcome.
>>>
>>> Thanks!
>>>
>>> - Paul
>>>
>>> On Apr 24, 2012, at 10:47 AM, Zhu, Lihua (Julie) wrote:
>>>
>>>> Eloi,
>>>>
>>>> I would like to use MotIV for a C. elegans dataset. What data source would
>>>> you recommend for matchMotif? Many thanks for your help!
>>>>
>>>> Best regards,
>>>>
>>>> Julie
>>>>
>>>> On 4/24/12 1:28 PM, "Mercier Eloi" <emercier at chibi.ubc.ca> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am one of the developers of MotIV. I will be happy to help you if you
>>>>> have any questions regarding the package.
>>>>>
>>>>> First, I want to mention that in the PLoS ONE paper we used PICS,
>>>>> rGADEM and MotIV as a pipeline, but MotIV can be used standalone.
>>>>> Some of the advanced functions won't be available though.
>>>>>
>>>>> Since the PWMs in MotIV correspond to human TFs, you may have to use your
>>>>> own list of PWMs. What MotIV needs is a simple list of matrices
>>>>> (head(jaspar) to view the format).
>>>>> JASPAR's PWMs can be easily downloaded, but it seems to contain only ~20
>>>>> motifs. On the other hand, AthaMap has more motifs, but I did not manage
>>>>> to find an easy way to get them. Another place to look is the AGRIS
>>>>> website (http://arabidopsis.med.ohio-state.edu/downloads.html).
>>>>>
>>>>> If you're only interested in the identification of the motifs and do not
>>>>> want to do further analysis with R, I recommend looking at
>>>>> http://www.benoslab.pitt.edu/stamp for the identification of your motifs.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Eloi Mercier
>>>>>
>>>>> On 12-04-24 07:36 AM, nooshin wrote:
>>>>>> Thanks a lot for your suggestion. I will for sure have a look and inform
>>>>>> you.
>>>>>> Bests,
>>>>>> Nooshin
>>>>>>
>>>>>> On 04/24/2012 04:15 PM, Tim Triche, Jr. wrote:
>>>>>>> Ah, I see. GSL is a useful library to have installed regardless.
>>>>>>> Hope things work out. I found your exchanges with Paul to be useful
>>>>>>> reading, but obviously I was not reading closely enough, since Paul
>>>>>>> started off his code sample with biocLite('MotIV'). Oops :-o
>>>>>>>
>>>>>>> Here is a paper that I found interesting, which does go into some
>>>>>>> detail towards a "bulk" approach, from Gottardo's group:
>>>>>>>
>>>>>>> http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016432
>>>>>>>
>>>>>>> Perhaps it will be useful to you as well; I would be curious to hear if so.
>>>>>>>
>>>>>>> --t
>>>>>>>
>>>>>>> On Tue, Apr 24, 2012 at 7:00 AM, nooshin <n_omranian at yahoo.com> wrote:
>>>>>>>
>>>>>>>> Thanks, it has already been solved. It needs the GSL package, which is a
>>>>>>>> bit problematic, but I solved it already.
>>>>>>>>
>>>>>>>> But it includes only 5 matrices for Arabidopsis (on the web page),
>>>>>>>> and in the package as well!
>>>>>>>> I'm downloading them manually from AthaMap!
>>>>>>>>
>>>>>>>> Thanks again; I'll keep waiting for the 'bulk' approach.
>>>>>>>>
>>>>>>>> Bests,
>>>>>>>> Nooshin
>>>>>>>>
>>>>>>>> On 04/24/2012 03:16 PM, Tim Triche, Jr. wrote:
>>>>>>>>> source("http://bioconductor.org/biocLite.R")
>>>>>>>>> biocLite("MotIV")
>>>>>>>>>
>>>>>>>>> ought to do the trick for you
>>>>>>>>>
>>>>>>>>> On Tue, Apr 24, 2012 at 1:01 AM, nooshin <n_omranian at yahoo.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Paul,
>>>>>>>>>>
>>>>>>>>>> Thanks a lot.
>>>>>>>>>> I forgot to include bioc, since I only replied to you (not to all).
>>>>>>>>>>
>>>>>>>>>> I can't install the MotIV package to check. I searched on Google but
>>>>>>>>>> couldn't find any solution! Do you have any suggestions for installing
>>>>>>>>>> this package?
>>>>>>>>>>
>>>>>>>>>> Bests,
>>>>>>>>>> Nooshin
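Eloi's point above, that MotIV only needs a plain list of matrices, suggests that matrices downloaded by hand (from JASPAR, AthaMap or AGRIS) could be converted with a few lines of base R. The following is a minimal sketch, not code from the thread: the input layout (a `>` header line followed by four count rows in the order A, C, G, T), the function names `parse_count_matrices` and `to_probabilities`, and the toy motifs are all assumptions made for illustration. Compare the result with `head(jaspar)` before handing it to MotIV, since the orientation and scaling that MotIV expects should be confirmed there.

```r
# Parse a simplified JASPAR-style text block into a named list of 4 x N matrices.
# Assumed layout (hypothetical; check it against the file you actually download):
#   >ID NAME
#   four rows of whitespace-separated counts, in the order A, C, G, T
#   (optional row labels and brackets such as "A [ 1 2 3 ]" are tolerated)
parse_count_matrices <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]          # drop empty lines
  header_idx <- grep("^>", lines)
  motifs <- list()
  for (i in header_idx) {
    name <- sub("^>\\s*", "", lines[i])
    rows <- lines[(i + 1):(i + 4)]
    # Keep only digits, dots and spaces, so labels and brackets disappear
    rows <- gsub("[^0-9. ]", " ", rows)
    counts <- lapply(strsplit(trimws(rows), " +"), as.numeric)
    m <- do.call(rbind, counts)
    rownames(m) <- c("A", "C", "G", "T")
    motifs[[name]] <- m
  }
  motifs
}

# Convert a count matrix to column-wise probabilities (each column sums to 1).
to_probabilities <- function(m) sweep(m, 2, colSums(m), "/")

# Toy input with made-up motifs, in both the labelled and the bare layout.
example_text <- c(
  ">M0001 toyA",
  "A [ 8 0 0 2 ]",
  "C [ 0 8 0 2 ]",
  "G [ 0 0 8 2 ]",
  "T [ 0 0 0 2 ]",
  ">M0002 toyB",
  "4 0 4",
  "0 4 0",
  "0 0 0",
  "0 0 0"
)

pwm_list <- lapply(parse_count_matrices(example_text), to_probabilities)
str(pwm_list)
```

The list names come from the header lines, so the TF-to-PWM mapping travels with the matrices.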
Paul,

Thanks for the positive feedback on FlyFactorSurvey! The motifs in this database were generated using the bacterial one-hybrid method (B1H and B1H-seq). All the public motifs can be downloaded freely. It would be useful to have a Bioc data package, containing curated and current motifs from all organisms where available, that interfaces with MotIV.

MEME works very well for finding motifs in B1H-seq data (Christensen et al., Nucleic Acids Research 2011, Vol. 39, No. 12, e83), although only a limited number of motif discovery tools were compared in the paper. Currently, we are working on whether motif discovery can be improved with B1H-seq data.

As I understand it, MEME is for de novo motif discovery; TOMTOM and STAMP are for testing whether the motif returned by a motif finder is significantly similar to a known motif; and clover is for searching for known motifs in a given set of sequences. We are thinking of adding clover to our website.

I am looking forward to your collated survey results.

Best regards,

Julie
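To make the distinction Julie draws more concrete, here is a rough sketch of what "comparing a discovered motif against known motifs" involves: slide one matrix along the other without gaps, score each overlap column by column, and keep the best offset. This is only an illustration in base R. The similarity measure (one minus the total variation distance between columns), the helper names and the toy motifs are assumptions, and it is not the statistic that TOMTOM or STAMP actually compute, nor does it estimate significance.

```r
# Similarity of two probability columns: 1 for identical, 0 for disjoint.
column_similarity <- function(p, q) 1 - 0.5 * sum(abs(p - q))

# Best ungapped alignment of motif `b` against motif `a` (both 4 x N matrices).
# An offset of k means column 1 of `b` is aligned with column 1 + k of `a`.
best_alignment <- function(a, b, min_overlap = 3) {
  stopifnot(ncol(a) >= min_overlap, ncol(b) >= min_overlap)
  offsets <- seq(-(ncol(b) - min_overlap), ncol(a) - min_overlap)
  scores <- vapply(offsets, function(off) {
    cols_a <- seq_len(ncol(a))
    cols_b <- cols_a - off
    keep <- cols_b >= 1 & cols_b <= ncol(b)
    if (sum(keep) < min_overlap) return(NA_real_)
    # Mean column similarity over the overlapping columns only
    mean(mapply(function(i, j) column_similarity(a[, i], b[, j]),
                cols_a[keep], cols_b[keep]))
  }, numeric(1))
  best <- which.max(scores)
  list(offset = offsets[best], score = scores[best])
}

# Build a toy probability matrix from a consensus string (hypothetical helper):
# 0.7 for the consensus base and 0.1 for each of the other three.
make_pwm <- function(consensus) {
  m <- matrix(0.1, nrow = 4, ncol = nchar(consensus),
              dimnames = list(c("A", "C", "G", "T"), NULL))
  bases <- strsplit(consensus, "")[[1]]
  m[cbind(match(bases, rownames(m)), seq_along(bases))] <- 0.7
  m
}

query <- make_pwm("CACGTG")                      # motif from a motif finder
known <- list(toy_long      = make_pwm("GCCACGTGGC"),
              toy_unrelated = make_pwm("TTTAAATTTA"))

hits   <- lapply(known, best_alignment, b = query)
scores <- vapply(hits, function(h) h$score, numeric(1))
print(scores)
```

Searching sequences for occurrences of a known motif, which is what clover is described as doing above, is a different operation: there the matrix is scored against windows of DNA rather than against another matrix.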
Hi, To carry on the MEME stuff, a biostar post just pointed me to an updated scoring metric in tomtom which is made available in the latest MEME software suite: http://bioinformatics.oxfordjournals.org/content/27/12/1603.full Perhaps wrapping parts of the MEME suite into an R library would be useful, no? You might find the FIRE (and FIRE-pro) suite of tools also useful for motif discovery, as welll: http://physiology.med.cornell.edu/faculty/elemento/lab/software.shtml Related to that, S. Tavazoie gave a talk at the recent CSHL/sysbio meeting and presented TEISER, which seems pretty cool if you're looking for structural motifs: https://tavazoielab.c2b2.columbia.edu/TEISER/ -steve On Wed, Apr 25, 2012 at 9:44 AM, Zhu, Lihua (Julie) <julie.zhu at="" umassmed.edu=""> wrote: > Paul, > > Thanks for the positive feedback on FlyFactorSurvey! The motifs in this > database are generated using the bacterial one-hybrid method (B1H and > B1H-seq). All the public motifs can be downloaded freely. It would be useful > to have a Bioc data package, containing curated and current motifs from all > organisms if available, that interfaces with MotiV. > > MEME works very well in finding motifs from B1H-seq data (Christensen et > al.,Nucleic Acid Research 2011, Vol39, No.12 e83), although only limited > motif discovery tools were compared in the paper. Currently, we are working > on whether motif discovery can be improved with B1H-seq data. > > As I understand, MEME is for de nova motif discovery, TOMTOM and STAMP are > for testing whether the motif returned by a motif finder is significantly > similar to a known motif, clover is for searching known motifs in a given > set of sequences. We are thinking of adding clover to our website. > > I am looking forward to your collated survey results. > > Best regards, > > Julie > > > On 4/24/12 11:02 PM, "Paul Shannon" <pshannon at="" fhcrc.org=""> wrote: > >> Hi Julie, >> >> FlyFactorSurvey looks great. ? 
Would that we had such a resource (curated, >> current, and growing) for all organisms! >> >> A few questions, if I may: >> >> ? 1) What role with respect to FlyFactorSurvey do you picture us taking here >> at BioC? ?How can we help? >> >> ? 2) Your website (http://pgfe.umassmed.edu/TFDBS) recommends meme and TOMTOM >> for motif comparison. ?Do you use them yourself? ?If so, can you tell us about >> their strengths and weaknesses? ?How do they compare to clover? >> (http://zlab.bu.edu/clover/) >> >> In that same spirit -- trying to find out more about this topic -- here are >> some more questions: >> >> ? 3) The JASPAR database seems to be mostly unchanged since 2009. >> ? ? ?(http://jaspar.genereg.net/html/DOWNLOAD). Does anyone know their update >> policy? >> >> ? 4) Is TRANSFAC only for license holders? >> >> ? 5) Are there any other organism-specific gems like FlyFactorSurvey to be >> discovered out on the web? >> >> Thanks! >> >> ?- Paul >> >> On Apr 24, 2012, at 3:16 PM, Zhu, Lihua (Julie) wrote: >> >>> Paul, >>> >>> Thanks so much for the comprehensive summary of existing capability of Bioc >>> and other resources for motif discovery and matching! >>> >>> Here is my response to your great initiative to collect use cases and open >>> data resources. >>> >>> Here is an open data source for Drosophila which we developed: >>> http://pgfe.umassmed.edu/TFDBS/ >>> http://nar.oxfordjournals.org/content/early/2010/11/19/nar.gkq858.full >>> >>> As you pointed out, there are several excellent Bioconductor packages >>> available for the two common cases of motif problems, i.e., de nova motif >>> discovery and motif matching to known motifs. It would be useful to have >>> more motif databases available for motif comparison program such as MotIV. >>> In addition, we use clover to search for known motifs in a given set of >>> sequences. >>> >>> Many thanks for sharing your insights! 
>>> >>> Best regards, >>> >>> Julie >>> >>> >>> On 4/24/12 3:02 PM, "Paul Shannon" <pshannon at="" fhcrc.org=""> wrote: >>> >>>> The recent flurry of interest in sequence motifs here on the bioc list >>>> suggests to us that maybe we at Bioconductor could strengthen our >>>> infrastructure for this kind of work. ?If this work interests you -- either >>>> as >>>> a package creator, or as a package user -- please suggest ideas or use >>>> cases. >>>> What do you need? ?I will collect and collate the responses. ? We hope to >>>> identify places where Bioc can help out. >>>> >>>> For background: ?we already have a number of packages (rGADEM, MotIV, cosmo, >>>> BCRANK, motifRG) which address, with different strengths, what I believe to >>>> be >>>> the two aspects of the motif problem: >>>> >>>> ?1) Detecting enriched motifs in DNA sequence, or in ChIP-seq data ?(rGADEM, >>>> cosmo, motifRG, BCRANK) >>>> ?2) Predicting the sequence motifs which bind to these enriched motifs, and >>>> what binding molecules they belong to (MotIV) >>>> >>>> In the past, a lot of sequence motif/binding work has addressed the search >>>> for >>>> transcription factor binding sites and their cognate transcription factors. >>>> miRNAs, phorphorylation and methylation all pose related problems. ?Is there >>>> support which we can practically offer here as well? >>>> >>>> In addition to Bioc packages, there are of course many worthwhile websites >>>> and >>>> external tools: ?JASPAR, meme, STAMP (and TRANSFAC, for those with a >>>> license). >>>> Nooshin mentioned the arabidopsis-specific 'AthaMap' >>>> (http://www.athamap.de). >>>> Are there other open-source data repositories like this for other organisms? >>>> c.elegans, as Julie requested? >>>> >>>> Questions, suggestions, use cases and data sources are all welcome. >>>> >>>> Thanks! 
>>>>
>>>> - Paul
>>>>
>>>> On Apr 24, 2012, at 10:47 AM, Zhu, Lihua (Julie) wrote:
>>>>
>>>>> Eloi,
>>>>>
>>>>> I would like to use MotIV for a c.elegans dataset. What data source would you recommend for matchMotif? Many thanks for your help!
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Julie
>>>>>
>>>>> On 4/24/12 1:28 PM, "Mercier Eloi" <emercier at chibi.ubc.ca> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am one of the developers of MotIV. I will be happy to help you if you have any question regarding the package.
>>>>>>
>>>>>> First, I want to mention that in the PLoS ONE paper, we used PICS, rGADEM and MotIV as a pipeline, but MotIV can be used as a stand-alone. Some of the advanced functions won't be available though.
>>>>>>
>>>>>> Since the PWMs in MotIV correspond to human TFs, you may have to use your own list of PWMs. What MotIV needs is a simple list of matrices (head(jaspar) to view the format).
>>>>>> JASPAR's PWMs can be easily downloaded but it seems it only contains ~20 motifs. On the other hand, AthaMap has more motifs but I did not manage to find an easy way to get them. Another place to look at is the AGRIS website (http://arabidopsis.med.ohio-state.edu/downloads.html).
>>>>>>
>>>>>> If you're only interested in the identification of the motifs and do not want to do further analysis with R, I recommend you look at http://www.benoslab.pitt.edu/stamp for the identification of your motifs.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Eloi Mercier
>>>>>>
>>>>>> On 12-04-24 07:36 AM, nooshin wrote:
>>>>>>> Thanks a lot for your suggestion. I will for sure have a look and inform you.
>>>>>>> Bests,
>>>>>>> Nooshin
>>>>>>>
>>>>>>> On 04/24/2012 04:15 PM, Tim Triche, Jr. wrote:
>>>>>>>> Ah, I see. GSL is a useful library to have installed regardless. Hope things work out. I found your exchanges with Paul to be useful reading, but obviously I was not reading closely enough, since Paul started off his code sample with biocLite('MotIV'). Oops :-o
>>>>>>>>
>>>>>>>> Here is a paper that I found interesting, which does go into some detail towards a "bulk" approach, from Gottardo's group:
>>>>>>>>
>>>>>>>> http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016432
>>>>>>>>
>>>>>>>> Perhaps it will be useful to you as well, would be curious to hear if so.
>>>>>>>>
>>>>>>>> --t
>>>>>>>>
>>>>>>>> On Tue, Apr 24, 2012 at 7:00 AM, nooshin <n_omranian at yahoo.com> wrote:
>>>>>>>>
>>>>>>>>    Thanks, it's been already solved, it needs GSL package, which is a bit problematic, but I solved it already.
>>>>>>>>
>>>>>>>>    But it does include only 5 matrices (in the webpage) for arabidopsis and in the package also!
>>>>>>>>    I'm downloading manually from AthaMap!
>>>>>>>>
>>>>>>>>    Thanks again and keep waiting for 'bulk' approach.
>>>>>>>>
>>>>>>>>    Bests,
>>>>>>>>    Nooshin
>>>>>>>>
>>>>>>>>    On 04/24/2012 03:16 PM, Tim Triche, Jr. wrote:
>>>>>>>>>    source("http://bioconductor.org/biocLite.R")
>>>>>>>>>    biocLite("MotIV")
>>>>>>>>>
>>>>>>>>>    ought to do the trick for you
>>>>>>>>>
>>>>>>>>>    On Tue, Apr 24, 2012 at 1:01 AM, nooshin <n_omranian at yahoo.com> wrote:
>>>>>>>>>
>>>>>>>>>        Hi Paul,
>>>>>>>>>
>>>>>>>>>        Thanks a lot.
>>>>>>>>>        I forgot to include bioc, since I only replied to you (not to all).
>>>>>>>>>
>>>>>>>>>        I can't install MotIV package to check. I checked in google but I couldn't find any solution! Do you have any suggestion for installing this package?
>>>>>>>>>
>>>>>>>>>        Bests,
>>>>>>>>>        Nooshin
>>>>>>>>>
>>>>>>>>>        On 04/23/2012 06:35 PM, Paul Shannon wrote:
>>>>>>>>>> (redirecting this back to the Bioc list...)
>>>>>>>>>>
>>>>>>>>>> Hi Nooshin,
>>>>>>>>>>
>>>>>>>>>> The 'bulk' approach is not quite so ready as I predicted. I might have something by the end of the week.
>>>>>>>>>>
>>>>>>>>>> As for mapping between PWMs and TFs, I have most often done this with 'tom-tom' from the meme website.
>>>>>>>>>>
>>>>>>>>>> But I just discovered what looks like a good -- maybe better -- approach: the Bioconductor MotIV package, which includes a 2010 version of JASPAR.
>>>>>>>>>> Try this:
>>>>>>>>>>
>>>>>>>>>>    source("http://bioconductor.org/biocLite.R")
>>>>>>>>>>    biocLite ('MotIV')
>>>>>>>>>>    library (MotIV);
>>>>>>>>>>    browseVignettes ('MotIV')
>>>>>>>>>>
>>>>>>>>>> The jaspar data in this package has 130 TF-PWM mappings, which appear to be human. More must be known, and publicly available. The JASPAR website has a 'JASPAR CORE Plantae' data set that
>>>>>>>>>>    - is probably what you are interested in
>>>>>>>>>>    - might be downloadable, and convertible to the form MotIV wants.
>>>>>>>>>>
>>>>>>>>>> Perhaps other readers of the list have other suggestions.
>>>>>>>>>>
>>>>>>>>>> If you have any questions on this, please include 'BioC' in your reply, so that we can all get better at this!
>>>>>>>>>>
>>>>>>>>>>  - Paul
>>>>>>>>>>
>>>>>>>>>> On Apr 23, 2012, at 6:53 AM, nooshin wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Paul,
>>>>>>>>>>>
>>>>>>>>>>> Many thanks for your comprehensive information and code!
>>>>>>>>>>> I have a question regarding to extract of PWMs. How and where I can download these matrices for all TFs that PWM is available for them? I need it only for Arabidopsis thaliana.
>>>>>>>>>>> Is there any package in R which I can give the TF and receive the PWM for it? Or any online database which I can download from it? I have a big problem since Friday to find out these matrices for different TFs of A.th. That would be so great if you can help me to get these matrices.
>>>>>>>>>>>
>>>>>>>>>>>> If you want to do this in bulk, Herve' has some lovely code to make that efficient.
>>>>>>>>>>> Also can I have this? :)
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot in advance.
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Nooshin
>>>>>>>>>
>>>>>>>>>        _______________________________________________
>>>>>>>>>        Bioconductor mailing list
>>>>>>>>>        Bioconductor at r-project.org
>>>>>>>>>        https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>>>        Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>>>
>>>>>>>>>    --
>>>>>>>>>    /A model is a lie that helps you see the truth./
>>>>>>>>>    Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
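Eloi's point quoted above, that MotIV only needs a simple list of matrices, suggests that a downloaded motif file could be converted with a few lines of base R. What follows is a minimal sketch, not MotIV's own reader. It assumes a JASPAR-style text layout (a ">ID name" header followed by four rows of counts for A, C, G and T), and the names parse_jaspar_text, to_probabilities and the toy motif are hypothetical, made up for illustration. The exact layout MotIV expects should still be checked against head(jaspar).

```r
# Sketch only: the file layout is an assumption, not a documented MotIV format.
# Assumed input: ">ID name" header, then four rows of counts (A, C, G, T),
# optionally decorated as "A [ 1 2 3 ]".
parse_jaspar_text <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]            # drop blank lines
  starts <- grep("^>", lines)              # one header line per motif
  motifs <- lapply(starts, function(i) {
    rows <- lines[(i + 1):(i + 4)]         # the A, C, G, T rows
    counts <- lapply(rows, function(r) {
      # keep only the numbers, ignoring row labels and brackets
      as.numeric(regmatches(r, gregexpr("[0-9]+(\\.[0-9]+)?", r))[[1]])
    })
    m <- do.call(rbind, counts)            # 4 x N count matrix
    rownames(m) <- c("A", "C", "G", "T")
    m
  })
  names(motifs) <- sub("^>\\s*", "", lines[starts])
  motifs
}

# Turn counts into per-position probabilities (each column sums to 1).
to_probabilities <- function(m, pseudocount = 0) {
  m <- m + pseudocount
  sweep(m, 2, colSums(m), "/")
}

# Hypothetical toy motif, not taken from any database.
example_text <- c(
  ">toy1 hypothetical_TF",
  "A [ 8 0 0 2 ]",
  "C [ 0 0 10 2 ]",
  "G [ 2 0 0 4 ]",
  "T [ 0 10 0 2 ]"
)

pfm_list <- parse_jaspar_text(example_text)
pwm_list <- lapply(pfm_list, to_probabilities)
print(pwm_list)
```

For a real file, readLines("path/to/downloaded_matrices.txt") would replace example_text (the path is a placeholder). Whether counts or probabilities are wanted downstream is worth confirming before passing the list on.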
Regarding S. cerevisiae, there is a great resource: the YEASTRACT database (http://www.yeastract.com/). It compiles motifs that have already been identified, documented regulation based on wet-lab research, and potential regulation based on comparing motifs against the promoter regions of genes. It is reviewed and updated regularly, with new motifs and newly documented regulation being added.

pj

On 25 April 2012 at 15:12, Steve Lianoglou <mailinglist.honeypot@gmail.com> wrote:

> Hi,
>
> To carry on the MEME stuff, a biostar post just pointed me to an updated scoring metric in tomtom which is made available in the latest MEME software suite:
>
> http://bioinformatics.oxfordjournals.org/content/27/12/1603.full
>
> Perhaps wrapping parts of the MEME suite into an R library would be useful, no?
>
> You might find the FIRE (and FIRE-pro) suite of tools also useful for motif discovery, as well:
>
> http://physiology.med.cornell.edu/faculty/elemento/lab/software.shtml
>
> Related to that, S. Tavazoie gave a talk at the recent CSHL/sysbio meeting and presented TEISER, which seems pretty cool if you're looking for structural motifs:
>
> https://tavazoielab.c2b2.columbia.edu/TEISER/
>
> -steve
>
> On Wed, Apr 25, 2012 at 9:44 AM, Zhu, Lihua (Julie) <julie.zhu@umassmed.edu> wrote:
> > Paul,
> >
> > Thanks for the positive feedback on FlyFactorSurvey! The motifs in this database are generated using the bacterial one-hybrid method (B1H and B1H-seq). All the public motifs can be downloaded freely. It would be useful to have a Bioc data package, containing curated and current motifs from all organisms if available, that interfaces with MotIV.
> >
> > MEME works very well in finding motifs from B1H-seq data (Christensen et al., Nucleic Acids Research 2011, Vol. 39, No. 12, e83), although only limited motif discovery tools were compared in the paper. Currently, we are working on whether motif discovery can be improved with B1H-seq data.
> >
> > As I understand, MEME is for de novo motif discovery, TOMTOM and STAMP are for testing whether the motif returned by a motif finder is significantly similar to a known motif, clover is for searching known motifs in a given set of sequences. We are thinking of adding clover to our website.
> >
> > I am looking forward to your collated survey results.
> >
> > Best regards,
> >
> > Julie
> >
> > On 4/24/12 11:02 PM, "Paul Shannon" <pshannon@fhcrc.org> wrote:
> >
> >> Hi Julie,
> >>
> >> FlyFactorSurvey looks great. Would that we had such a resource (curated, current, and growing) for all organisms!
> >>
> >> A few questions, if I may:
> >>
> >> 1) What role with respect to FlyFactorSurvey do you picture us taking here at BioC? How can we help?
> >>
> >> 2) Your website (http://pgfe.umassmed.edu/TFDBS) recommends meme and TOMTOM for motif comparison. Do you use them yourself? If so, can you tell us about their strengths and weaknesses? How do they compare to clover? (http://zlab.bu.edu/clover/)
> >>
> >> In that same spirit -- trying to find out more about this topic -- here are some more questions:
> >>
> >> 3) The JASPAR database seems to be mostly unchanged since 2009 (http://jaspar.genereg.net/html/DOWNLOAD). Does anyone know their update policy?
> >>
> >> 4) Is TRANSFAC only for license holders?
> >>
> >> 5) Are there any other organism-specific gems like FlyFactorSurvey to be discovered out on the web?
> >>
> >> Thanks!
> >>
> >> - Paul
> >>
> >> On Apr 24, 2012, at 3:16 PM, Zhu, Lihua (Julie) wrote:
> >>
> >>> Paul,
> >>>
> >>> Thanks so much for the comprehensive summary of existing capability of Bioc and other resources for motif discovery and matching!
> >>>
> >>> Here is my response to your great initiative to collect use cases and open data resources.
> >>>
> >>> Here is an open data source for Drosophila which we developed:
> >>> http://pgfe.umassmed.edu/TFDBS/
> >>> http://nar.oxfordjournals.org/content/early/2010/11/19/nar.gkq858.full
> >>>
> >>> As you pointed out, there are several excellent Bioconductor packages available for the two common cases of motif problems, i.e., de novo motif discovery and motif matching to known motifs. It would be useful to have more motif databases available for motif comparison programs such as MotIV. In addition, we use clover to search for known motifs in a given set of sequences.
> >>>
> >>> Many thanks for sharing your insights!
> >>>
> >>> Best regards,
> >>>
> >>> Julie
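Julie's description quoted above, that TOMTOM and STAMP test whether a discovered motif resembles a known one, can be illustrated with a toy comparison in base R. This is only a sketch under stated assumptions: it slides the shorter matrix along the longer one without gaps and scores each placement by the mean per-column Euclidean distance. That is a deliberate simplification and is not the scoring used by TOMTOM, STAMP or MotIV (there is no significance estimate and no reverse-complement search). All function names and the toy database are hypothetical.

```r
# Sketch only: a simplified, ungapped motif comparison, not any tool's method.

# Build a 4 x N one-hot matrix from a DNA string (toy data only).
one_hot <- function(s) {
  bases <- c("A", "C", "G", "T")
  chars <- strsplit(s, "")[[1]]
  m <- vapply(chars, function(ch) as.numeric(bases == ch), numeric(4))
  rownames(m) <- bases
  m
}

# Similarity of two equally wide probability matrices, in [0, 1].
# sqrt(2) is the largest Euclidean distance between two probability vectors.
column_similarity <- function(a, b) {
  d <- sqrt(colSums((a - b)^2))
  1 - mean(d) / sqrt(2)
}

# Slide the shorter matrix along the longer one and keep the best placement.
best_match_score <- function(query, target) {
  if (ncol(query) > ncol(target)) {
    tmp <- query; query <- target; target <- tmp
  }
  w <- ncol(query)
  offsets <- 0:(ncol(target) - w)
  max(vapply(offsets, function(o) {
    column_similarity(query, target[, (o + 1):(o + w), drop = FALSE])
  }, numeric(1)))
}

# Score a query against every matrix in a named list, best first.
rank_matches <- function(query, database) {
  scores <- vapply(database, best_match_score, numeric(1), query = query)
  sort(scores, decreasing = TRUE)
}

# Hypothetical two-entry database and query.
toy_db <- list(toyA = one_hot("ACGT"), toyB = one_hot("TTAAC"))
scores <- rank_matches(one_hot("CG"), toy_db)
print(scores)  # toyA contains "CG" exactly, so it ranks first
```

In practice the database argument would be a list of real matrices (for example, a species-specific collection such as those discussed in this thread), and a proper tool would be preferable wherever a p-value or E-value is needed.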
Hi,

We are the developers of the de novo motif discovery package rGADEM. While rGADEM fits nicely in the ChIP-Seq analysis pipeline PICS/rGADEM/MotIV, it can also easily be used to analyze enriched regions obtained from any other peak caller. Among the many outputs of rGADEM are the position weight matrices (PWMs), which can be used directly for analysis in MotIV, and also in other programs that compare motifs against databases. Since it is implemented using OpenMP, the time required for each analysis is greatly reduced. We would happily consider collaborating with you if you are interested in testing our de novo motif discovery package on your dataset.

Regards,
Arnaud.

On 2012-04-25 at 10:12, Steve Lianoglou wrote:

> Hi,
>
> To carry on the MEME stuff, a biostar post just pointed me to an updated scoring metric in tomtom which is made available in the latest MEME software suite:
>
> http://bioinformatics.oxfordjournals.org/content/27/12/1603.full
>
> Perhaps wrapping parts of the MEME suite into an R library would be useful, no?
>
> You might find the FIRE (and FIRE-pro) suite of tools also useful for motif discovery, as well:
>
> http://physiology.med.cornell.edu/faculty/elemento/lab/software.shtml
>
> Related to that, S. Tavazoie gave a talk at the recent CSHL/sysbio meeting and presented TEISER, which seems pretty cool if you're looking for structural motifs:
>
> https://tavazoielab.c2b2.columbia.edu/TEISER/
>
> -steve
>
> On Wed, Apr 25, 2012 at 9:44 AM, Zhu, Lihua (Julie) <julie.zhu@umassmed.edu> wrote:
>> Paul,
>>
>> Thanks for the positive feedback on FlyFactorSurvey! The motifs in this database are generated using the bacterial one-hybrid method (B1H and B1H-seq). All the public motifs can be downloaded freely. It would be useful to have a Bioc data package, containing curated and current motifs from all organisms if available, that interfaces with MotIV.
>>
>> MEME works very well in finding motifs from B1H-seq data (Christensen et al., Nucleic Acids Research 2011, Vol. 39, No. 12, e83), although only limited motif discovery tools were compared in the paper. Currently, we are working on whether motif discovery can be improved with B1H-seq data.
>>
>> As I understand, MEME is for de novo motif discovery, TOMTOM and STAMP are for testing whether the motif returned by a motif finder is significantly similar to a known motif, clover is for searching known motifs in a given set of sequences. We are thinking of adding clover to our website.
>>
>> I am looking forward to your collated survey results.
>>
>> Best regards,
>>
>> Julie
>>
>> On 4/24/12 11:02 PM, "Paul Shannon" <pshannon@fhcrc.org> wrote:
>>
>>> Hi Julie,
>>>
>>> FlyFactorSurvey looks great. Would that we had such a resource (curated, current, and growing) for all organisms!
>>>
>>> A few questions, if I may:
>>>
>>> 1) What role with respect to FlyFactorSurvey do you picture us taking here at BioC? How can we help?
>>>
>>> 2) Your website (http://pgfe.umassmed.edu/TFDBS) recommends meme and TOMTOM for motif comparison. Do you use them yourself? If so, can you tell us about their strengths and weaknesses? How do they compare to clover? (http://zlab.bu.edu/clover/)
>>>
>>> In that same spirit -- trying to find out more about this topic -- here are some more questions:
>>>
>>> 3) The JASPAR database seems to be mostly unchanged since 2009 (http://jaspar.genereg.net/html/DOWNLOAD). Does anyone know their update policy?
>>>
>>> 4) Is TRANSFAC only for license holders?
>>>
>>> 5) Are there any other organism-specific gems like FlyFactorSurvey to be discovered out on the web?
>>>
>>> Thanks!
>>>
>>> - Paul
>>>
>>> On Apr 24, 2012, at 3:16 PM, Zhu, Lihua (Julie) wrote:
>>>
>>>> Paul,
>>>>
>>>> Thanks so much for the comprehensive summary of existing capability of Bioc and other resources for motif discovery and matching!
>>>>
>>>> Here is my response to your great initiative to collect use cases and open data resources.
>>>>
>>>> Here is an open data source for Drosophila which we developed:
>>>> http://pgfe.umassmed.edu/TFDBS/
>>>> http://nar.oxfordjournals.org/content/early/2010/11/19/nar.gkq858.full
>>>>
>>>> As you pointed out, there are several excellent Bioconductor packages available for the two common cases of motif problems, i.e., de novo motif discovery and motif matching to known motifs. It would be useful to have more motif databases available for motif comparison programs such as MotIV. In addition, we use clover to search for known motifs in a given set of sequences.
>>>>
>>>> Many thanks for sharing your insights!
>>>>
>>>> Best regards,
>>>>
>>>> Julie
>>>>
>>>> On 4/24/12 3:02 PM, "Paul Shannon" <pshannon@fhcrc.org> wrote:
>>>>
>>>>> The recent flurry of interest in sequence motifs here on the bioc list suggests to us that maybe we at Bioconductor could strengthen our infrastructure for this kind of work. If this work interests you -- either as a package creator, or as a package user -- please suggest ideas or use cases. What do you need? I will collect and collate the responses. We hope to identify places where Bioc can help out.
>>>>>
>>>>> For background: we already have a number of packages (rGADEM, MotIV, cosmo, BCRANK, motifRG) which address, with different strengths, what I believe to be the two aspects of the motif problem:
>>>>>
>>>>>  1) Detecting enriched motifs in DNA sequence, or in ChIP-seq data (rGADEM, cosmo, motifRG, BCRANK)
>>>>>  2) Predicting the sequence motifs which bind to these enriched motifs, and what binding molecules they belong to (MotIV)
>>>>>
>>>>> In the past, a lot of sequence motif/binding work has addressed the search for transcription factor binding sites and their cognate transcription factors. miRNAs, phosphorylation and methylation all pose related problems. Is there support which we can practically offer here as well?
>>>>>
>>>>> In addition to Bioc packages, there are of course many worthwhile websites and external tools: JASPAR, meme, STAMP (and TRANSFAC, for those with a license). Nooshin mentioned the arabidopsis-specific 'AthaMap' (http://www.athamap.de). Are there other open-source data repositories like this for other organisms? c.elegans, as Julie requested?
>>>>>
>>>>> Questions, suggestions, use cases and data sources are all welcome.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> - Paul
>>>>>
>>>>> On Apr 24, 2012, at 10:47 AM, Zhu, Lihua (Julie) wrote:
>>>>>
>>>>>> Eloi,
>>>>>>
>>>>>> I would like to use MotIV for a c.elegans dataset. What data source would you recommend for matchMotif? Many thanks for your help!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Julie
>>>>>>
>>>>>> On 4/24/12 1:28 PM, "Mercier Eloi" <emercier@chibi.ubc.ca> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am one of the developers of MotIV. I will be happy to help you if you have any question regarding the package.
>>>>>>>
>>>>>>> First, I want to mention that in the PLoS ONE paper, we used PICS, rGADEM and MotIV as a pipeline, but MotIV can be used as a stand-alone. Some of the advanced functions won't be available though.
>>>>>>>
>>>>>>> Since the PWMs in MotIV correspond to human TFs, you may have to use your own list of PWMs. What MotIV needs is a simple list of matrices (head(jaspar) to view the format).
>>>>>>> JASPAR's PWMs can be easily downloaded but it seems it only contains ~20 motifs. On the other hand, AthaMap has more motifs but I did not manage to find an easy way to get them. Another place to look at is the AGRIS website (http://arabidopsis.med.ohio-state.edu/downloads.html).
>>>>>>>
>>>>>>> If you're only interested in the identification of the motifs and do not want to do further analysis with R, I recommend you look at http://www.benoslab.pitt.edu/stamp for the identification of your motifs.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Eloi Mercier
>>>>>>>
>>>>>>> On 12-04-24 07:36 AM, nooshin wrote:
>>>>>>>> Thanks a lot for your suggestion. I will for sure have a look and inform you.
>>>>>>>> Bests,
>>>>>>>> Nooshin
>>>>>>>>
>>>>>>>> On 04/24/2012 04:15 PM, Tim Triche, Jr. wrote:
>>>>>>>>> Ah, I see. GSL is a useful library to have installed regardless. Hope things work out. I found your exchanges with Paul to be useful reading, but obviously I was not reading closely enough, since Paul started off his code sample with biocLite('MotIV'). Oops :-o
>>>>>>>>>
>>>>>>>>> Here is a paper that I found interesting, which does go into some detail towards a "bulk" approach, from Gottardo's group:
>>>>>>>>>
>>>>>>>>> http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016432
>>>>>>>>>
>>>>>>>>> Perhaps it will be useful to you as well, would be curious to hear if so.
>>>>>>>>> >>>>>>>>> --t >>>>>>>>> >>>>>>>>> On Tue, Apr 24, 2012 at 7:00 AM, nooshin<n_omranian@yahoo.com>>>>>>>>> <mailto:n_omranian@yahoo.com>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, it's been already solved, it needs GSL package, which is a >>>>>>>>> bit problematic, but I solved it already. >>>>>>>>> >>>>>>>>> But it does include only 5 matrices (in the webpage) for >>>>>>>>> arabidopsis and in the package also! >>>>>>>>> I'm downloading manually from AthaMap! >>>>>>>>> >>>>>>>>> Thanks again and keep waiting for 'bulk' approach. >>>>>>>>> >>>>>>>>> Bests, >>>>>>>>> Nooshin >>>>>>>>> >>>>>>>>> >>>>>>>>> On 04/24/2012 03:16 PM, Tim Triche, Jr. wrote: >>>>>>>>>> source("http://bioconductor.org/biocLite.R") >>>>>>>>>> biocLite("MotIV") >>>>>>>>>> >>>>>>>>>> ought to do the trick for you >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Apr 24, 2012 at 1:01 AM, nooshin<n_omranian@yahoo.com>>>>>>>>>> <mailto:n_omranian@yahoo.com>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Paul, >>>>>>>>>> >>>>>>>>>> Thanks a lot. >>>>>>>>>> I forgot to include bioc, since I only replied to you (no to >>>>>>>>>> all). >>>>>>>>>> >>>>>>>>>> I can"t install MotIV package to check. I checked in google but >>>>>>>>>> I >>>>>>>>>> couldn't find any solution! Do you have any suggestion for >>>>>>>>>> installing >>>>>>>>>> this package? >>>>>>>>>> >>>>>>>>>> Bests, >>>>>>>>>> Nooshin >>>>>>>>>> >>>>>>>>>> On 04/23/2012 06:35 PM, Paul Shannon wrote: >>>>>>>>>>> (redirecting this back to the Bioc list...) >>>>>>>>>>> >>>>>>>>>>> Hi Nooshin, >>>>>>>>>>> >>>>>>>>>>> The 'bulk' approach is not quite so ready as I predicted. >>>>>>>>>> I might have something by the end of the week. >>>>>>>>>>> >>>>>>>>>>> As for mapping between PWMs and TFs, I have most often done >>>>>>>>>> this with 'tom-tom' from the meme website. 
>>>>>>>>>>> >>>>>>>>>>> But I just discovered what looks like a good -- maybe >>>>>>>>>> better -- approach: the Bioconductor MotIV package, which >>>>>>>>>> includes a 2010 version of jasper. >>>>>>>>>>> Try this: >>>>>>>>>>> >>>>>>>>>>> source("http://bioconductor.org/biocLite.R") >>>>>>>>>>> >>>>>>>>>>> biocLite ('MotIV') >>>>>>>>>>> library (MotIV); >>>>>>>>>>> browseVignettes ('MotIV') >>>>>>>>>>> >>>>>>>>>>> The jaspar data in this package has 130 TF-PWM mappings, >>>>>>>>>> which appear to be human. More must be known, and publicly >>>>>>>>>> available. The JASPAR website has a 'JASPAR CORE Plantae' >>>>>>>>>> data set that >>>>>>>>>>> - is probably what you are interested in >>>>>>>>>>> - might be downloadable, and convertible to the form >>>>>>>>>> MotIV wants. >>>>>>>>>>> >>>>>>>>>>> Perhaps other readers of the list have other suggestions. >>>>>>>>>>> >>>>>>>>>>> If you have any questions on this, please include 'BioC' in >>>>>>>>>> your reply, so that we can all get better at this! >>>>>>>>>>> >>>>>>>>>>> - Paul >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Apr 23, 2012, at 6:53 AM, nooshin wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Paul, >>>>>>>>>>>> >>>>>>>>>>>> Many thanks for your comprehensive information and code! >>>>>>>>>>>> I have a question regarding to extract of PWMs. How and >>>>>>>>>> where I can download these matrices for all TFs that PWM is >>>>>>>>>> available for them? I need it only for Arabidopsis thaliana. >>>>>>>>>>>> Is there any package in R which I can give the TF and >>>>>>>>>> receive the PWM for it? Or any online database which I can >>>>>>>>>> download from it? I have a big problem since Friday to find >>>>>>>>>> out these matrices for different TFs of A.th. That would be >>>>>>>>>> so great if you can help me to get these matrices. >>>>>>>>>>>> >>>>>>>>>>>>> If you want to do this in bulk, Herve' has some lovely >>>>>>>>>> code to make that efficient. >>>>>>>>>>>> Also can I have this? 
:) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks a lot in advance. >>>>>>>>>>>> Best regards, >>>>>>>>>>>> Nooshin >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> *TODAY*/(Beta) /*.*Powered by Yahoo! >>>>>>>>>> >>>>>>>>>> Armored catfish wreak havoc in U.S. South >>>>>>>>>> >>>>>>>>>> <http: news.yahoo.com="" blogs="" sideshow="" armored-catfish-="" wreaking-havoc-so="">>>>>>>>>> ut >>>>>>>>>> h- >>>>>>>>>> florida- lakes-182812663.html;_ylc=X3oDMTFia2oyNjZoBF9TAzk1NDAxMDAyNwRwa >>>>>>>>>> 2c >>>>>>>>>> Da >>>>>>>>>> WQtMjIzODM5NARzeWlkA2RfZWNoMGQ4MGQ-#more-4190> >>>>>>>>>> >>>>>>>>>> Privacy Policy >>>>>>>>>> <http: info.yahoo.com="" privacy="" us="" yahoo="" webbeacons="" details.html=""> >>>>>>>>>> >>>>>>>>>> [[alternative HTML version deleted]] >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Bioconductor mailing list >>>>>>>>>> Bioconductor@r-project.org<mailto:bioconductor@r-project.org> >>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>>>>>> Search the archives: >>>>>>>>>> >>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> /A model is a lie that helps you see the truth./ >>>>>>>>>> / >>>>>>>>>> / >>>>>>>>>> Howard Skipper >>>>>>>>>> <http: cancerres.aacrjournals.org="" content="" 31="" 9="" 1173.full.pdf=""> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> /A model is a lie that helps you see the truth./ >>>>>>>>> / >>>>>>>>> / >>>>>>>>> Howard Skipper >>>>>>>>> <http: cancerres.aacrjournals.org="" content="" 31="" 9="" 1173.full.pdf=""> >>>>>>>>> >>>>>>>> >>>>>>>> [[alternative HTML version deleted]] >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Bioconductor mailing list >>>>>>>> Bioconductor@r-project.org >>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>>>> Search the archives: >>>>>>>> 
http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Bioconductor mailing list >>>>>> Bioconductor@r-project.org >>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>> Search the archives: >>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>> >>>> >>>> >>> >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > -- > Steve Lianoglou > Graduate Student: Computational Systems Biology > | Memorial Sloan-Kettering Cancer Center > | Weill Medical College of Cornell University > Contact Info: http://cbio.mskcc.org/~lianos/contact > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
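Eloi's remark that MotIV only needs "a simple list of matrices" can be sketched in base R. This is a minimal illustration rather than MotIV's documented input specification: the motif names and counts below are made up, and the layout (rows A, C, G, T; one column per motif position; columns normalized to frequencies) is an assumption that should be checked against head(jaspar) before the list is handed to the package.

```r
# Hypothetical count matrices, invented for illustration only.
# matrix() fills column by column, so each group of four numbers is one
# motif position in the order A, C, G, T.
bases <- c("A", "C", "G", "T")
counts <- list(
  toyMotif1 = matrix(c(8, 1, 0, 1,
                       0, 9, 1, 0,
                       1, 0, 9, 0,
                       0, 0, 1, 9),
                     nrow = 4, dimnames = list(bases, NULL)),
  toyMotif2 = matrix(c( 2, 16,  1,  1,
                       18,  0,  1,  1,
                        0,  0, 20,  0),
                     nrow = 4, dimnames = list(bases, NULL))
)

# Divide every column by its total so that each position sums to 1.
to_frequencies <- function(m) sweep(m, 2, colSums(m), "/")

# A named list of matrices, one entry per motif (assumed target format).
pwm_list <- lapply(counts, to_frequencies)
str(pwm_list)
```

Matrices collected by hand from AthaMap or AGRIS could be placed in such a list in the same way, one named entry per transcription factor.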
ADD REPLY
0
Entering edit mode
Hello,

Many thanks for the reply.

Actually, I'm working with Arabidopsis thaliana, and this is my first time doing sequence-based analysis. I have a list of TFs and some clusters of genes, and I need to check which TFs' binding sites (motifs) are enriched in these clusters.

The other analysis is this: I have a TF in Arabidopsis that consists of two parts (RAV1), and I need to check whether the motif of this specific TF is enriched in any of the gene sequences (500-1000 bp) in these clusters. Which R packages would be helpful here? It's really my first experience with sequence-based analysis.

I took all the information from AthaMap manually and restructured it, both matrix-based and pattern-based. I sent them an email, but they have not supplied me with any data yet, so I did it manually! I also checked AGRIS, but I couldn't find any matrix-based information there :(

I used STAMP, but with STAMP there is no way to give a specific motif and search for its enrichment in sequences! Also, as you mentioned, I need to do further analysis in R after identifying the motifs, so I prefer to use R rather than online tools. I also checked MEME (MAST and MEME-ChIP), with which I could get the result I need, but I need to work in R.

Many thanks all.

Bests,
Nooshin

On 04/24/2012 07:28 PM, Mercier Eloi wrote:
> Hello,
>
> I am one of the developers of MotIV. I will be happy to help you if you
> have any questions regarding the package.
>
> First, I want to mention that in the PLoS ONE paper we used PICS,
> rGADEM and MotIV as a pipeline, but MotIV can be used standalone.
> Some of the advanced functions won't be available though.
>
> Since the PWMs in MotIV correspond to human TFs, you may have to use
> your own list of PWMs. What MotIV needs is a simple list of matrices
> (head(jaspar) to view the format).
> JASPAR's PWMs can be easily downloaded, but it seems it only contains
> ~20 motifs. On the other hand, AthaMap has more motifs, but I did not
> manage to find an easy way to get them. Another place to look at is
> the AGRIS website (http://arabidopsis.med.ohio-state.edu/downloads.html)
>
> If you're only interested in the identification of the motifs and do
> not want to do further analysis with R, I recommend you look at
> http://www.benoslab.pitt.edu/stamp for the identification of your motifs.
>
> Regards,
>
> Eloi Mercier
>
> --
> Eloi Mercier
> Bioinformatics PhD Student, UBC
> Paul Pavlidis Lab
> 2185 East Mall
> University of British Columbia
> Vancouver BC V6T1Z4
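The question of whether one known motif is over-represented in a cluster of promoter sequences can be sketched in base R. This is only a rough illustration under stated assumptions, not a method proposed in the thread: the matrix is a toy with invented frequencies (consensus CAACA), the sequences are made up, the 80% score cutoff and uniform background are arbitrary choices, and the helper names (score_windows, rc_matrix, has_hit) are hypothetical. In practice the Biostrings functions matchPWM() and countPWM() do the scanning on real DNAString objects.

```r
bases <- c("A", "C", "G", "T")

# Toy position frequency matrix, filled column by column (A, C, G, T per position).
pfm <- matrix(c(0.1, 0.7, 0.1, 0.1,   # position 1: mostly C
                0.7, 0.1, 0.1, 0.1,   # position 2: mostly A
                0.7, 0.1, 0.1, 0.1,   # position 3: mostly A
                0.1, 0.7, 0.1, 0.1,   # position 4: mostly C
                0.7, 0.1, 0.1, 0.1),  # position 5: mostly A
              nrow = 4, dimnames = list(bases, NULL))

# Log-odds scores against a uniform background (assumption: 25% per base).
logodds <- log2(pfm / 0.25)

# Reverse-complement a matrix: reverse the columns and swap A<->T, C<->G.
rc_matrix <- function(m) {
  rc <- m[c("T", "G", "C", "A"), rev(seq_len(ncol(m))), drop = FALSE]
  rownames(rc) <- bases
  rc
}

# Score every window of one sequence; windows with non-ACGT symbols get -Inf.
score_windows <- function(seq, m) {
  idx <- match(strsplit(toupper(seq), "")[[1]], rownames(m))
  w <- ncol(m)
  n <- length(idx) - w + 1
  if (n < 1) return(numeric(0))
  vapply(seq_len(n), function(i) {
    rows <- idx[i:(i + w - 1)]
    if (anyNA(rows)) return(-Inf)
    sum(m[cbind(rows, seq_len(w))])
  }, numeric(1))
}

# TRUE for each sequence with at least one window above the cutoff on either strand.
has_hit <- function(seqs, m, min_score) {
  rc <- rc_matrix(m)
  vapply(seqs, function(s) {
    any(score_windows(s, m) >= min_score) || any(score_windows(s, rc) >= min_score)
  }, logical(1), USE.NAMES = FALSE)
}

# Cutoff at 80% of the best possible score (arbitrary choice for this sketch).
min_score <- 0.8 * sum(apply(logodds, 2, max))

# Made-up sequences standing in for cluster promoters and a background set.
cluster    <- c("TTGCAACATTG", "GGCAACAGGTT", "ATATATATATA", "CCTGTTGCCAA")
background <- c("ATATATATATA", "GGGGCCCCGGG", "TTTTAAAATTT",
                "ACGTACGTACG", "GCAACATTTTT", "CCCCCCCCCCC")

cluster_hits    <- has_hit(cluster, logodds, min_score)
background_hits <- has_hit(background, logodds, min_score)

# 2x2 table of sequences with and without a hit, then a one-sided Fisher test.
tab <- rbind(hit    = c(cluster = sum(cluster_hits),  background = sum(background_hits)),
             no_hit = c(sum(!cluster_hits), sum(!background_hits)))
enrichment <- fisher.test(tab, alternative = "greater")
print(tab)
print(enrichment$p.value)
```

Counting sequences with at least one hit, rather than total hits, keeps the test simple; with many motifs or clusters the resulting p-values would also need multiple-testing correction, for example with p.adjust().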