ChIPpeakAnno
0
0
Entering edit mode
Bahar ▴ 10
@bahar-6444
Last seen 9.6 years ago
Paul Shannon <pshannon at="" ...=""> writes: > > I offer below a short example of using rGADEM with MotifDb which you may find useful. > > The example motivates and then demonstrates "seeded" search, in which a candidate TF is provided to > rGADEM. (The rGADEM vignette illustrates the simpler task of de novo motif search. > > - Paul > > On Feb 27, 2013, at 9:03 AM, Zhu, Lihua (Julie) wrote: > > > Jose, > > > > Thanks for your kind word! > > > > You might want to check out rGADEM package for de nova motif discovery. > > > > Best regards, > > > > Julie > > > > library(rGADEM) > library(MotifDb) > library(BSgenome.Hsapiens.UCSC.hg19) > path <- system.file("extdata/Test_100.fasta", package="rGADEM") > sequences <- readDNAStringSet(path, "fasta") > length(sequences) # 49 > > # you can use the UCSC genome browser to BLAT the first 4 sequences, one at a time, > # against human hg19, and see what TFs bind there > # http://genome.ucsc.edu/cgi-bin/hgBlat?command=start > > as.character(substr(sequences[1:4], 1, 80)) > > # Once you blat, click through to view the blat match in the genome browser, > # and enable the "Transcription Factor ChIP-seq from ENCODE" track. > # The FOXA1 Tf has a high score in this region of chr1 > # for most of these sequences > > # obtain the reported motif for this TF from the Bioc MotifDb package > > query(MotifDb, 'foxa1') > > # MotifDb object of length 1 > # | Created from downloaded public sources: 2012-Nov-01 > # | 1 position frequency matrices from 1 source: > # | JASPAR_CORE: 1 > # | 1 organism/s > # | Hsapiens: 1 > # Hsapiens-JASPAR_CORE-FOXA1-MA0148.1 > > # now use rGADEM to do a seeded search: > > pwm.foxa1 <- as.list(MotifDb["Hsapiens-JASPAR_CORE-FOXA1-MA0148.1"]) > gadem.foxa1 <-GADEM(sequences,verbose=1,genome=Hsapiens,Spwm=pwm.foxa1, seed=TRUE) > consensus(gadem.foxa1) > > # [1] "nTGTTTACwyw" "GmwGrrrsswGGvAGn" > > # how do the 49 sequences match to these two motifs? > mean(sapply(gadem.foxa1 <at> motifList[[1]] <at> alignList, function(x) x <at> pval)) > mean(sapply(gadem.foxa1 <at> motifList[[2]] <at> alignList, function(x) x <at> pval)) > > On Feb 27, 2013, at 9:03 AM, Zhu, Lihua (Julie) wrote: > > > Jose, > > > > Thanks for your kind word! > > > > You might want to check out rGADEM package for de nova motif discovery. > > > > Best regards, > > > > Julie > > > > _______________________________________________ > Bioconductor mailing list > Bioconductor at ... > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > > Hi Paul, Thank you very much for your example rGADEM and motIV analysis. I am working with a set of sequences in fasta format and looking for potential motif denovo. Once identified, I wanted to verify if those match with available databases (JASPAR etc.) However I fail to get already reported motifs using the same sequences. I copied the script below for your input and suggestions please. library(rGADEM) pwd<-"" #INPUT FILES- BedFiles, FASTA, etc. path<- system.file("extdata/mysequences.fasta",package="rGADEM") FastaFile<-paste(pwd,path,sep="") Sequences <- readDNAStringSet(FastaFile, "fasta") gadem<-GADEM(Sequences, seed=1, verbose=1, numWordGroup=3, numTop3mer=20, numTop4mer=40, numTop5mer=60, numGeneration=5, pValue=0.0002, eValue=0.0, maskR=1, maxSpaceWidth=10, useChIPscore=0, numEM=40, fEM=0.5, fullScan=1, slideWinPWM=6, stopCriterion=1, numBackgSets=10, weightType=0, bFileName="NULL", nmotifs=25, extTrim=1, minSpaceWidth=0, minSites =-1) motifs <- getPWM(gadem) print(motifs) Thank you very much for your comments. Bahar
rGADEM MotIV MotifDb rGADEM MotIV MotifDb • 1.4k views
ADD COMMENT

Login before adding your answer.

Traffic: 572 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6