occurrence of rGADEM motifs
@mattia-pelizzola-3304
Last seen 14 months ago
Italy
Hi,

I am using rGADEM and MotIV to find enriched motifs in my ChIP-seq peaks and to determine their similarity to Jaspar TFBS. These tools look very useful!

rGADEM provides a list of enriched motifs. The total number of occurrences of each motif is given by the nOccurrences function, but I can't find a way to find out which peak regions contain these motifs. In particular, what are the startPos and endPos functions supposed to do? I would expect a set of genomic positions (or positions relative to the peak regions) with the same length as nOccurrences, but I only get one number for each motif, with no chromosome associated. Even in the rGADEM vignette nOccurrences is equal to 60, but the startPos and endPos functions return only one number. Am I missing or misunderstanding anything?

Additionally, I was wondering whether it is possible to control the maximum number of processors used in the analysis. I am working on a cluster shared between many people, and the software apparently uses as many processors as possible, while I do not want to be that greedy with other users.

Thanks for any hint,
mattia
rGADEM MotIV • 1.8k views
SimonNoël ▴ 450
@simonnoel-3455
Last seen 10.3 years ago
What do you think? On top of that, it uses Jaspar, a very powerful and comprehensive database that we are not using yet.

Simon Noël
CdeC

(Sent in reply to mattia pelizzola's message "[BioC] occurrence of rGADEM motifs" of 26 July 2011, 06:51.)
Sorry for the last mail, I hit the wrong button.

Simon Noël
CdeC
Heidi Dvinge ★ 2.0k
@heidi-dvinge-2195
Last seen 10.3 years ago
Hi Mattia,

> The total number of motifs is provided by the nOccurrences function, but I
> can't find a way to get to know which peak regions do contain these motifs.

I don't really know rGADEM, so I can't tell you if there's a direct way of doing that. You can, however, extract the PWM itself using getPWM (rGADEM), and then match it to your sequence(s) of interest with matchPWM (Biostrings). The latter also lets you control the score, i.e. how similar a reported match must be to your PWM of interest.

I seem to remember that rGADEM sometimes gives you quite long motifs, where the information content around the edges is quite low. If you extract the actual PWMs, you might therefore want to consider trimming them before matching them to your sequences.

HTH
\Heidi
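The matching step suggested above could look roughly like the following. This is only a minimal sketch: the matrix and the two peak sequences are toy data invented for illustration, `find_hits` is a hypothetical helper name, and in a real analysis the matrix would be one of those returned by getPWM and the sequences would be the actual peaks. It does not show how rGADEM itself stores occurrences.

```r
library(Biostrings)

# Toy position weight matrix for the motif "ACGT" (hypothetical data).
# In a real analysis this would be one of the matrices returned by getPWM().
pwm <- matrix(
  c(0.7, 0.1, 0.1, 0.1,   # position 1: mostly A
    0.1, 0.7, 0.1, 0.1,   # position 2: mostly C
    0.1, 0.1, 0.7, 0.1,   # position 3: mostly G
    0.1, 0.1, 0.1, 0.7),  # position 4: mostly T
  nrow = 4,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)

# Toy peak sequences (hypothetical); real ones would be the ChIP-seq peaks.
peaks <- DNAStringSet(c(peak1 = "TTACGTTT", peak2 = "GGGGGGGG"))

# Scan each peak on the given strand and collect the hit positions,
# expressed relative to the start of that peak.
find_hits <- function(pwm, peaks, min.score = "80%") {
  per_peak <- lapply(seq_along(peaks), function(i) {
    hits <- matchPWM(pwm, peaks[[i]], min.score = min.score)
    data.frame(
      peak  = rep(names(peaks)[i], length(hits)),
      start = start(hits),
      end   = end(hits),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, per_peak)
}

hits <- find_hits(pwm, peaks, min.score = "90%")
print(hits)
```

The result has one row per match, so the peaks containing the motif are simply the distinct values of the `peak` column. Positions are relative to each peak; adding the peak's genomic start minus one would give genomic coordinates. Note that matchPWM scans only the strand it is given, so the opposite strand would need a second pass, for example with the reverse-complemented matrix.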
@charles-joly-beauparlant-4777
Last seen 5.7 years ago
Canada
Hi mattia,

About the number of processors used: rGADEM uses OpenMP to speed up certain parts of the calculation. So, to control the maximum number of processors used, one option is to set the OMP_NUM_THREADS environment variable. For example, you can run the command "export OMP_NUM_THREADS=8" in a Linux terminal before starting an rGADEM analysis, and OpenMP will use at most 8 threads.

I do not have access to other operating systems right now to test this, but I found information on setting the OMP_NUM_THREADS environment variable on this page:
http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/cpp/lin/optaps/common/optaps_par_var.htm

Best regards,
Charles Joly Beauparlant
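When exporting a shell variable is inconvenient (for example when R is started from a GUI), the same variable can in principle be set from inside the R session. The snippet below is a sketch under an assumption that has not been tested with rGADEM here: OpenMP runtimes typically read OMP_NUM_THREADS when they initialise, so the variable has to be set before the parallel code is loaded or run. The limit of 2 is an arbitrary illustrative value.

```r
# Cap the number of OpenMP threads for code run from this R session.
# Assumption: the variable must be set before the OpenMP runtime starts,
# so do this before loading or running the parallel code.
n_threads <- 2  # hypothetical limit; choose what suits the shared machine
Sys.setenv(OMP_NUM_THREADS = as.character(n_threads))

# Check the value the session now exposes to the code it runs.
Sys.getenv("OMP_NUM_THREADS")
```

If the setting appears to have no effect, exporting the variable in the shell before starting R, as described above, is the safer route.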