Entering edit mode
Simon No?l <simon.noel.2 at="" ulaval.ca=""> writes:
> Whell, now I got the result I wanted. In R, I use
>
>>library(org.Hs.eg.db)
>
>>mysymbols <- scan("./Gene_list.csv", what = 'character', skip =1)
>>myEgIDs <- unlist(mget(mysymbols, org.Hs.egSYMBOL2EG))
>>write.table(myEgIDs,file="myEgIDs.xls")
>
> After, I use my .xls with the program stripe.
Hi Simon --
You say the summer has just begun, so... Here's something that might
be fun to work with...
I use the lattice package to create the plot. I load the package,
define two 'helper' functions, and then a wrapper to do what I want...
library(lattice)
prepanel.idio <-
function(chrs, ..., xlim)
{
if (missing(xlim))
list(xlim=c(1, max(chrs[["Length"]])))
else
list(xlim=xlim)
}
panel.idio <-
function(x, y, chrs, ..., gtick=.2, colp="blue", colm="red")
{
## helpers: y coordinates at integer values
yy <- unique(y)
o <- order(yy)
yi <- match(y, yy[o])
## draw 'chromosomes'
with(chrs, {
idx <- match(yy, Chromosome)
panel.segments(1, seq_along(yy), Length[idx][o],
seq_along(yy))
})
## plot genes
panel.segments(abs(x), yi, abs(x),
ifelse(x<0, yi-gtick, yi+gtick),
..., col=ifelse(x<0, colm, colp))
}
idioplot <-
function(chrs, ..., prepanel=prepanel.idio, panel=panel.idio)
{
xyplot(..., prepanel=prepanel, panel=panel, chrs=chrs)
}
Then I get the data needed to do the plots
library(org.Hs.eg.db)
sym <- c("ACSL1", "ACTG2", "ADAMTSL2", "ADH1C",
"ADORA1", "ADRA2A", "AHNAK", "AIM2", "ALAS2")
id <- unlist(mget(sym, org.Hs.egSYMBOL2EG))
chrloc <- mget(id, org.Hs.egCHRLOC)
chr <- unlist(lapply(chrloc, names), use.names=FALSE)
and form it into the data structures needed to use my functions
genes <- data.frame(Id=rep(id, sapply(chrloc, length)),
Chromosome=factor(chr,
levels=c(1:22, "X", "Y", "M")),
Position=unlist(chrloc, use.names=FALSE))
chrs <- data.frame(Chromosome=factor(
names(org.Hs.egCHRLENGTHS),
levels=c(1:22, "X", "Y", "M")),
Length=org.Hs.egCHRLENGTHS)
Finally I draw the plot
idio <- idioplot(chrs, Chromosome~Position, genes)
show(idio)
I should have access to all the regular plot formating parameters (see
?xyplot, ?panel.xyplot, ?segments, ?panel.segments), in addition to
changing how long the gene markings are, and the colors of the
segments for each gene. You even have 'zoom' capabilities
idioplot(chrs, Chromosome~Position, genes[genes$Chromosome=="11", ],
xlim=c(61000000, 63000000))
or, a little more hack-y (I don't know how to update the y limits when
y is a factor)
update(idio, ylim=c(5.5,6.5), xlim=c(61000000, 63000000))
More generally you might look at rtracklayer (for exporting to the
UCSC genome browser), GenomeGraph (for plotting gene-level
information, for instance), geneplotter, idiogram, ...
Martin
> Thans's you for your help.
>
> Does anyone have an idea about my problem with qc()?
>
> Selon Simon No?l <simon.noel.2 at="" ulaval.ca="">, 23.05.2009:
>
>> I have try your procedure and every thing is loking fine for now.
What's the
>> netx step to plot my genes on their chromosomes?
>>
>> Selon Simon No?l <simon.noel.2 at="" ulaval.ca="">, 21.05.2009:
>>
>> > I am sorry. Has I say, I am new to R and bioconductor. I am
still
>> learning.
>> >
>> > I will try this tomorow and I will write you back to tell you the
result.
>> > Thank's you for your help. Il it's working, I will only have to
wory about
>> > my
>> > problem with qc()... Or at least for now. The summer is really
young;)
>> > Selon Hervé Pagès <hpages at="" fhcrc.org="">, 21.05.2009:
>> >
>> > > Hi Simon,
>> > >
>> > > Not a good idea to start a new thread by replying to a
different thread
>> > > you started previously. Then it shows up under the previous
thread even
>> > > if you changed the subject.
>> > >
>> > > more below...
>> > >
>> > > Simon No?l wrote:
>> > > > Hello every one. I have a question. I have a gene list in a
.xls like
>> > > >
>> > > > probeID Symbol
>> > > > 1030431 ACSL1
>> > > > 4610431 ACTG2
>> > > > 4810575 ADAMTSL2
>> > > > 1510750 ADH1C
>> > > > 4060519 ADORA1
>> > > > 5720523 ADRA2A
>> > > > 2810482 AHNAK
>> > > > 1260270 AIM2
>> > > > 4180768 ALAS2
>> > > > ... ...
>> > > >
>> > > > I want to plote all of those genes on their chromosome. How
can I do
>> > this?
>> > >
>> > > So first you need to map each gene to its chromosome location.
>> > >
>> > > You can use one of the org.*.eg.db annotation packages for
>> > > this (pick up the one for your organism):
>> > >
>> > > http://bioconductor.org/packages/release/data/annotation/
>> > >
>> > > and use the SYMBOL2EG map to map your gene symbols to their
corresponding
>> > > Entrez IDs and then the CHRLOC map to map your Entrez IDs to
their
>> > chromosome
>> > > locations.
>> > >
>> > > Example:
>> > >
>> > > library(org.Hs.eg.db)
>> > > mysymbols <- c("ACSL1", "ACTG2", "ADAMTSL2", "ADH1C",
>> > > "ADORA1", "ADRA2A", "AHNAK", "AIM2", "ALAS2")
>> > > myEgIDs <- unlist(mget(mysymbols, org.Hs.egSYMBOL2EG))
>> > > mylocs <- unname(unlist(mget(myEgIDs, org.Hs.egCHRLOC)))
>> > >
>> > > One thing to be aware of is that those mappings are not
necessarily
>> > > one-to-one e.g. the same symbol can be associated with
different genes:
>> > >
>> > > > flat <- toTable(org.Hs.egSYMBOL2EG)
>> > > > names(flat)
>> > > [1] "gene_id" "symbol"
>> > > > any(duplicated(flat$gene_id))
>> > > [1] FALSE
>> > > > any(duplicated(flat$symbol))
>> > > [1] TRUE
>> > >
>> > > The same thing happens with the org.Hs.egCHRLOC map (I'm not
sure
>> > > why we have this though, may be others on the list can
explain).
>> > >
>> > > Anyway this explains why 'mylocs' can have more elements than
>> 'mysymbols'.
>> > >
>> > > Cheers,
>> > > H.
>> > >
>> > > >
>> > > > Simon No?l
>> > > > VP Externe CADEUL
>> > > > Association des ?tudiants et ?tudiantes en Biochimie, Bio-
>> > > > informatique et Microbiologie de l'Universit? Laval
>> > > > CdeC
>> > > >
>> > > > _______________________________________________
>> > > > Bioconductor mailing list
>> > > > Bioconductor at stat.math.ethz.ch
>> > > > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > > > Search the archives:
>> > >
http://news.gmane.org/gmane.science.biology.informatics.conductor
>> > >
>> > > --
>> > > Hervé Pagès
>> > >
>> > > Program in Computational Biology
>> > > Division of Public Health Sciences
>> > > Fred Hutchinson Cancer Research Center
>> > > 1100 Fairview Ave. N, M2-B876
>> > > P.O. Box 19024
>> > > Seattle, WA 98109-1024
>> > >
>> > > E-mail: hpages at fhcrc.org
>> > > Phone: (206) 667-5791
>> > > Fax: (206) 667-1319
>> > >
>> > >
>> >
>> >
>> > Simon No?l
>> > VP Externe CADEUL
>> > Association des ?tudiants et ?tudiantes en Biochimie, Bio-
>> > informatique et Microbiologie de l'Universit? Laval
>> > CdeC
>>
>>
>> Simon No?l
>> VP Externe CADEUL
>> Association des ?tudiants et ?tudiantes en Biochimie, Bio-
>> informatique et Microbiologie de l'Universit? Laval
>> CdeC
>
>
> Simon No?l
> VP Externe CADEUL
> Association des ?tudiants et ?tudiantes en Biochimie, Bio-
> informatique et Microbiologie de l'Universit? Laval
> CdeC
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793