Question: How to plot genes on their chromosomes?
Martin Morgan (United States) wrote, 10.3 years ago:
Simon Noël <simon.noel.2 at ulaval.ca> writes:

> Well, now I got the result I wanted. In R, I use
>
> > library(org.Hs.eg.db)
> > mysymbols <- scan("./Gene_list.csv", what = 'character', skip = 1)
> > myEgIDs <- unlist(mget(mysymbols, org.Hs.egSYMBOL2EG))
> > write.table(myEgIDs, file="myEgIDs.xls")
>
> After, I use my .xls with the program stripe.

Hi Simon -- You say the summer has just begun, so...

Here's something that might be fun to work with. I use the lattice package to create the plot. I load the package, define two 'helper' functions, and then a wrapper to do what I want:

    library(lattice)

    ## x limits: span the longest chromosome unless the caller supplies xlim
    prepanel.idio <- function(chrs, ..., xlim) {
        if (missing(xlim)) list(xlim=c(1, max(chrs[["Length"]])))
        else list(xlim=xlim)
    }

    panel.idio <- function(x, y, chrs, ..., gtick=.2, colp="blue", colm="red") {
        ## helpers: y coordinates at integer values
        yy <- unique(y)
        o <- order(yy)
        yi <- match(y, yy[o])
        ## draw 'chromosomes' as horizontal segments of the right length
        with(chrs, {
            idx <- match(yy, Chromosome)
            panel.segments(1, seq_along(yy), Length[idx][o], seq_along(yy))
        })
        ## plot genes: ticks go down (colm) for negative positions,
        ## up (colp) for positive ones
        panel.segments(abs(x), yi, abs(x),
                       ifelse(x<0, yi-gtick, yi+gtick),
                       ..., col=ifelse(x<0, colm, colp))
    }

    ## wrapper around xyplot that passes the chromosome table to the panel
    idioplot <- function(chrs, ..., prepanel=prepanel.idio, panel=panel.idio) {
        xyplot(..., prepanel=prepanel, panel=panel, chrs=chrs)
    }

Then I get the data needed to do the plots

    library(org.Hs.eg.db)
    sym <- c("ACSL1", "ACTG2", "ADAMTSL2", "ADH1C", "ADORA1",
             "ADRA2A", "AHNAK", "AIM2", "ALAS2")
    id <- unlist(mget(sym, org.Hs.egSYMBOL2EG))
    chrloc <- mget(id, org.Hs.egCHRLOC)
    chr <- unlist(lapply(chrloc, names), use.names=FALSE)

and form it into the data structures needed to use my functions

    ## one row per gene location; an Id is repeated if it has several
    genes <- data.frame(Id=rep(id, sapply(chrloc, length)),
                        Chromosome=factor(chr, levels=c(1:22, "X", "Y", "M")),
                        Position=unlist(chrloc, use.names=FALSE))
    ## one row per chromosome, with its length
    chrs <- data.frame(Chromosome=factor(names(org.Hs.egCHRLENGTHS),
                                         levels=c(1:22, "X", "Y", "M")),
                       Length=org.Hs.egCHRLENGTHS)

Finally I draw the plot

    idio <- idioplot(chrs, Chromosome~Position, genes)
    show(idio)

You should have access to all the regular plot formatting parameters (see ?xyplot, ?panel.xyplot, ?segments, ?panel.segments), in addition to changing how long the gene markings are (gtick) and the colors of the segments for each gene (colp, colm). You even have 'zoom' capabilities

    idioplot(chrs, Chromosome~Position, genes[genes$Chromosome=="11", ],
             xlim=c(61000000, 63000000))

or, a little more hack-y (I don't know how to update the y limits when y is a factor)

    update(idio, ylim=c(5.5,6.5), xlim=c(61000000, 63000000))

More generally you might look at rtracklayer (for exporting to the UCSC genome browser), GenomeGraphs (for plotting gene-level information, for instance), geneplotter, idiogram, ...

Martin

> Thank you for your help.
>
> Does anyone have an idea about my problem with qc()?
>
> According to Simon Noël <simon.noel.2 at ulaval.ca>, 23.05.2009:
>
>> I have tried your procedure and everything is looking fine for now. What's
>> the next step to plot my genes on their chromosomes?
>>
>> According to Simon Noël <simon.noel.2 at ulaval.ca>, 21.05.2009:
>>
>> > I am sorry. As I said, I am new to R and Bioconductor. I am still
>> > learning.
>> >
>> > I will try this tomorrow and I will write back to tell you the result.
>> > Thank you for your help. If it's working, I will only have to worry about
>> > my problem with qc()... Or at least for now. The summer is really young ;)
>> >
>> > According to Hervé Pagès <hpages at fhcrc.org>, 21.05.2009:
>> >
>> > > Hi Simon,
>> > >
>> > > Not a good idea to start a new thread by replying to a different thread
>> > > you started previously. Then it shows up under the previous thread even
>> > > if you changed the subject.
>> > >
>> > > more below...
>> > >
>> > > Simon Noël wrote:
>> > > > Hello everyone. I have a question. I have a gene list in a .xls like
>> > > >
>> > > > probeID  Symbol
>> > > > 1030431  ACSL1
>> > > > 4610431  ACTG2
>> > > > 4810575  ADAMTSL2
>> > > > 1510750  ADH1C
>> > > > 4060519  ADORA1
>> > > > 5720523  ADRA2A
>> > > > 2810482  AHNAK
>> > > > 1260270  AIM2
>> > > > 4180768  ALAS2
>> > > > ...      ...
>> > > >
>> > > > I want to plot all of those genes on their chromosomes. How can I do
>> > > > this?
>> > >
>> > > So first you need to map each gene to its chromosome location.
>> > >
>> > > You can use one of the org.*.eg.db annotation packages for
>> > > this (pick the one for your organism):
>> > >
>> > > http://bioconductor.org/packages/release/data/annotation/
>> > >
>> > > and use the SYMBOL2EG map to map your gene symbols to their corresponding
>> > > Entrez IDs and then the CHRLOC map to map your Entrez IDs to their
>> > > chromosome locations.
>> > >
>> > > Example:
>> > >
>> > > library(org.Hs.eg.db)
>> > > mysymbols <- c("ACSL1", "ACTG2", "ADAMTSL2", "ADH1C",
>> > >                "ADORA1", "ADRA2A", "AHNAK", "AIM2", "ALAS2")
>> > > myEgIDs <- unlist(mget(mysymbols, org.Hs.egSYMBOL2EG))
>> > > mylocs <- unname(unlist(mget(myEgIDs, org.Hs.egCHRLOC)))
>> > >
>> > > One thing to be aware of is that those mappings are not necessarily
>> > > one-to-one, e.g. the same symbol can be associated with different genes:
>> > >
>> > > > flat <- toTable(org.Hs.egSYMBOL2EG)
>> > > > names(flat)
>> > > [1] "gene_id" "symbol"
>> > > > any(duplicated(flat$gene_id))
>> > > [1] FALSE
>> > > > any(duplicated(flat$symbol))
>> > > [1] TRUE
>> > >
>> > > The same thing happens with the org.Hs.egCHRLOC map (I'm not sure
>> > > why we have this though, maybe others on the list can explain).
>> > >
>> > > Anyway this explains why 'mylocs' can have more elements than
>> > > 'mysymbols'.
>> > >
>> > > Cheers,
>> > > H.
>> > >
>> > > >
>> > > > Simon Noël
>> > > > VP Externe CADEUL
>> > > > Association des étudiants et étudiantes en Biochimie, Bio-informatique
>> > > > et Microbiologie de l'Université Laval
>> > > > CdeC
>> > > >
>> > > > _______________________________________________
>> > > > Bioconductor mailing list
>> > > > Bioconductor at stat.math.ethz.ch
>> > > > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > > > Search the archives:
>> > > > http://news.gmane.org/gmane.science.biology.informatics.conductor
>> > >
>> > > --
>> > > Hervé Pagès
>> > >
>> > > Program in Computational Biology
>> > > Division of Public Health Sciences
>> > > Fred Hutchinson Cancer Research Center
>> > > 1100 Fairview Ave. N, M2-B876
>> > > P.O. Box 19024
>> > > Seattle, WA 98109-1024
>> > >
>> > > E-mail: hpages at fhcrc.org
>> > > Phone: (206) 667-5791
>> > > Fax: (206) 667-1319

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
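
The step that turns the mget() result into the 'genes' data frame can be tried without any annotation package installed. The sketch below is only an illustration, not part of the original post: the Entrez IDs and coordinates are made up, and the Strand and Start columns are names introduced here. It assumes the CHRLOC convention that a negative position marks the minus (antisense) strand, which is what the x<0 tests in panel.idio rely on.

```r
## Toy stand-in for the list returned by mget(id, org.Hs.egCHRLOC): one
## element per Entrez ID, each a vector of start positions named by
## chromosome. All IDs and coordinates below are hypothetical.
chrloc <- list(
    "101" = c("4" = -185676749),                    # one location, minus strand
    "102" = c("2" = 74119441),                      # one location, plus strand
    "103" = c("9" = -136397286, "9" = -136399974)   # two locations
)
id <- names(chrloc)

## One row per location. rep() repeats each ID once per location, so Id
## stays aligned with the flattened Position vector.
n <- vapply(chrloc, length, integer(1))
genes <- data.frame(
    Id = rep(id, n),
    Chromosome = factor(unlist(lapply(chrloc, names), use.names = FALSE),
                        levels = c(1:22, "X", "Y", "M")),
    Position = unlist(chrloc, use.names = FALSE)
)

## Split the signed position into a strand label and a plotting coordinate.
genes$Strand <- ifelse(genes$Position < 0, "-", "+")
genes$Start <- abs(genes$Position)

print(genes)
```

Because the hypothetical ID "103" has two locations, 'genes' ends up with four rows for three IDs. That is the same effect that makes 'mylocs' longer than 'mysymbols' in the quoted example.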