Chris Whelan
Hi Mattia,
I also just had to figure out how to access the alignment locations
returned by rGADEM. The object returned by GADEM() contains a list of
"motif" objects, each of which has a slot "alignList" holding a list of
"align" objects. To pull out the locations for the second motif, for
example, this worked for me:
> chrs <- sapply(gadem[[2]]@alignList, slot, 'chr')
> starts <- sapply(gadem[[2]]@alignList, slot, 'start')
> ends <- sapply(gadem[[2]]@alignList, slot, 'end')
> positions <- sapply(gadem[[2]]@alignList, slot, 'pos')
> locations <- cbind(chrs, starts, ends, pos = positions)
> head(locations)
     chrs   starts      ends        pos
[1,] "chr3" "11250871"  "11251624"  "167"
[2,] "chr7" "2975746"   "2976319"   "412"
[3,] "chr7" "129587981" "129588370" "140"
[4,] "chrX" "18735991"  "18736550"  "457"
[5,] "chr1" "40002871"  "40003399"  "232"
[6,] "chr1" "175910829" "175911459" "502"
I believe that starts and ends are the coordinates of the original
search regions you gave to GADEM, and that "pos" is the offset of the
motif within that region.
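
Note that cbind() coerces every column to character, which is why the coordinates print in quotes. If numeric coordinates are more convenient, the same extraction can be collected into a data frame. The sketch below is only an illustration: "mockAlign" is a hypothetical stand-in class (not part of rGADEM) so that the code runs without the package, and the assumption that the "chr" slot is character and the other three are numeric is mine. With a real result you would pass gadem[[2]]@alignList in place of alignList.

```r
library(methods)

# Hypothetical stand-in for rGADEM's "align" class, defined only so this
# sketch runs without rGADEM installed. Slot types are an assumption.
setClass("mockAlign",
         representation(chr = "character", start = "numeric",
                        end = "numeric", pos = "numeric"))

# Two records copied from the head(locations) output in this post.
alignList <- list(
  new("mockAlign", chr = "chr3", start = 11250871, end = 11251624, pos = 167),
  new("mockAlign", chr = "chr7", start = 2975746, end = 2976319, pos = 412)
)

# vapply() checks the type of every extracted value, and data.frame()
# keeps the coordinate columns numeric instead of coercing to character.
locations <- data.frame(
  chr   = vapply(alignList, slot, character(1), "chr"),
  start = vapply(alignList, slot, numeric(1), "start"),
  end   = vapply(alignList, slot, numeric(1), "end"),
  pos   = vapply(alignList, slot, numeric(1), "pos"),
  stringsAsFactors = FALSE
)
```

Because the columns are numeric, they can be sorted or filtered directly, for example locations[locations$chr == "chr7", ].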
Hope that helps - if someone knows better, please correct me.
Chris
> Message: 4
> Date: Tue, 26 Jul 2011 12:51:33 +0200
> From: mattia pelizzola <mattia.pelizzola at gmail.com>
> To: bioconductor <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] occurrence of rGADEM motifs
> Message-ID: <cag10-br939ye_+gss13l6zqhozuzcat7h_8ursk0fd9wi-7kxq at mail.gmail.com>
> Content-Type: text/plain
>
> Hi,
> I am using rGADEM and MotIV to find out enriched motifs in my ChIPseq peaks
> and determine the similarity with Jaspar TFBS. These tools look very useful!
>
> rGADEM provides a list of enriched motifs. The total number of motifs is
> provided by the nOccurrences function, but I can't find a way to get to know
> which peak regions do contain these motifs. In particular, what are the
> startPos and endPos functions supposed to do? I would expect a set of
> genomic positions (or positions relative to the peak regions) with the same
> length as nOccurrences, but I only get one number for each motif, with no
> chromosome associated.
> Even in the rGADEM vignette you have nOccurrences equal to 60 but then you
> get only one number out of the startPos and endPos functions. Am I missing
> or misunderstanding anything?
>
> Additionally, I was also wondering if it is possible to control the max
> number of processors used in the analysis. I am working on a cluster shared
> between many people and apparently the software uses as many processors as
> possible, while I do not want to be that greedy with other users ..
>
> Thanks for any hint,
>
> mattia