Regarding "Questions, suggestions, use cases and data sources are all
welcome for working with TF-PWM motifs", my 2c:
For de novo ChIP-seq motif finding methods besides MotIV and MEME,
there is a new paper that describes a fast method for ChIP-seq sets
and has recent references:
http://www.ncbi.nlm.nih.gov/pubmed/22228832
I like the HOMER suite (Perl, not R based), which is very fast, easy
to use and gives reasonable results:
http://biowhat.ucsd.edu/homer/chipseq/index.html
If Bioconductor matrix annotation packages are developed, it would be
good IMO to have:
-obviously all of JASPAR
-some sort of phylogenetic grouping, e.g. vertebrate, plant, in
addition to species-based grouping, since species-specific info is
usually limited and not usually required.
-the latest UniPROBE matrices, since JASPAR seems slow to update:
http://the_brain.bwh.harvard.edu/uniprobe/
-maybe the free version of TRANSFAC, even if it is terribly old?
-unlikely, perhaps, but it would be great if someone would
systematically go through all public ChIP-seq datasets and extract the
top motifs. Or you could grab the HOMER motifs from the website above,
where they have already done this to a limited extent.
Another area that would be good is methods for identifying known
motifs that are statistically significantly overrepresented in sets of
DNA sequences, compared to some user-selectable control set (sequences
from a control set, all promoter sequences, or the same genome
randomized in some way). This has been implemented in user-friendly
non-R ways many times, especially for promoter analysis of
differentially expressed gene sets, see e.g.:
-the HOMER package above
http://dire.dcode.org/
http://159.149.109.9/pscan/
http://www.dbi.tju.edu/dbi/tools/paint/
clover at
http://biowulf.bu.edu/MotifViz/
http://www.bioinfo.tsinghua.edu.cn/~zhengjsh/OTFBS/
http://www.telis.ucla.edu/index.php?cmd=transfac
http://grenada.lumc.nl/HumaneGenetica/CORE_TF/
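To make the overrepresentation idea concrete, here is a minimal sketch in plain Python (chosen only for illustration; it is not how any of the tools listed above work, and it is not a Bioconductor API). It assumes a toy count matrix, counts how many target and control sequences contain at least one window scoring above a threshold, and computes a one-sided hypergeometric p-value. The matrix, the threshold, and all function names are hypothetical.

```python
from math import comb, log2

# Hypothetical toy count matrix (one dict per motif position, pseudocounts
# already included) for a 4-bp motif resembling "TGAC".
COUNTS = [
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 17, "G": 1, "T": 1},
]


def log_odds(counts, background=0.25):
    """Convert a count matrix to a log2-odds PWM against a uniform background."""
    pwm = []
    for col in counts:
        total = sum(col.values())
        pwm.append({b: log2((n / total) / background) for b, n in col.items()})
    return pwm


def best_score(seq, pwm):
    """Highest PWM score over all windows on the given strand.

    Assumes seq contains only A/C/G/T and is at least as long as the motif.
    """
    w = len(pwm)
    return max(
        sum(pwm[i][seq[s + i]] for i in range(w))
        for s in range(len(seq) - w + 1)
    )


def count_hits(seqs, pwm, threshold):
    """Number of sequences with at least one window scoring >= threshold."""
    return sum(1 for s in seqs if best_score(s, pwm) >= threshold)


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n sequences are drawn from N, of which K carry the motif."""
    return sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
    ) / comb(N, n)


def enrichment(targets, controls, pwm, threshold):
    """Return (target hits, control hits, one-sided enrichment p-value)."""
    k = count_hits(targets, pwm, threshold)
    c = count_hits(controls, pwm, threshold)
    n = len(targets)
    N = n + len(controls)
    return k, c, hypergeom_tail(k, n, k + c, N)


if __name__ == "__main__":
    pwm = log_odds(COUNTS)
    targets = ["AATGACGG", "CCTGACAA", "TTTGACTT", "GGGGGGGG"]
    controls = ["AAAAAAAA", "CCCCCCCC", "GGGGGGGG", "TTTTTTTT"]
    print(enrichment(targets, controls, pwm, threshold=6.0))
```

A real analysis would also scan the reverse strand, use a background matched to the genome, and correct for testing many motifs; those steps are left out to keep the sketch short.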
Finally, it would be good to be able to easily search for a given
motif in a DNA sequence and attach a score to the match, like the
possum program listed at clover/motifviz above or like the TRANSFAC
Match software. This would use some kind of test against control
sequences and could currently be done using existing Bioconductor
tools, but it is a common enough use that a package would be good. The
idea is not just to give a score for how close the sequence is to the
PWM, but also to say how likely such a match is to happen by chance,
since many of the PWMs are very sloppy.
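As a rough illustration of "score plus chance", the sketch below (plain Python, stdlib only) takes the best PWM score in a sequence and estimates an empirical p-value by rescoring shuffled copies of the same sequence. This is one simple choice of control, not the method used by possum or TRANSFAC Match; the consensus-derived PWM, the shuffle count, and the function names are all assumptions made for the example.

```python
import random
from math import log2


def pwm_from_consensus(consensus, match_p=0.85, background=0.25):
    """Hypothetical toy PWM: match_p for the consensus base, the rest split evenly."""
    other_p = (1.0 - match_p) / 3.0
    return [
        {b: log2((match_p if b == c else other_p) / background) for b in "ACGT"}
        for c in consensus
    ]


def best_score(seq, pwm):
    """Highest PWM score over all windows on the given strand (A/C/G/T only)."""
    w = len(pwm)
    return max(
        sum(pwm[i][seq[s + i]] for i in range(w))
        for s in range(len(seq) - w + 1)
    )


def empirical_pvalue(seq, pwm, n_shuffles=1000, seed=0):
    """Return (observed best score, fraction of shuffles scoring at least as well).

    Shuffling keeps the base composition of seq but destroys its order, so the
    p-value answers: how often does a match this good arise by chance in a
    sequence with the same composition?
    """
    rng = random.Random(seed)
    observed = best_score(seq, pwm)
    letters = list(seq)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if best_score("".join(letters), pwm) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    pwm = pwm_from_consensus("TGAC")
    score, p = empirical_pvalue("AAAATGACAAAAAAAAAAAA", pwm)
    print(f"best score {score:.2f}, empirical p-value {p:.4f}")
```

A mononucleotide shuffle ignores dinucleotide structure such as CpG depletion, so a control built from real promoter or genomic sequences would usually be more convincing for sloppy PWMs.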
Vince
.........
On 4/24/12 11:02 PM, "Paul Shannon" <pshannon at fhcrc.org> wrote:
> Hi Julie,
>
> FlyFactorSurvey looks great. Would that we had such a resource (curated,
> current, and growing) for all organisms!
>
> A few questions, if I may:
>
> 1) What role with respect to FlyFactorSurvey do you picture us taking here
> at BioC? How can we help?
>
> 2) Your website (http://pgfe.umassmed.edu/TFDBS) recommends meme and TOMTOM
> for motif comparison. Do you use them yourself? If so, can you tell us about
> their strengths and weaknesses? How do they compare to clover?
> (http://zlab.bu.edu/clover/)
>
> In that same spirit -- trying to find out more about this topic -- here are
> some more questions:
>
> 3) The JASPAR database seems to be mostly unchanged since 2009.
> (http://jaspar.genereg.net/html/DOWNLOAD). Does anyone know their update
> policy?
>
> 4) Is TRANSFAC only for license holders?
>
> 5) Are there any other organism-specific gems like FlyFactorSurvey to be
> discovered out on the web?
>
> Thanks!
>
> - Paul
.............