Re: Defining your own chromosome annotations


Palmer, Lance ▴ 30

@palmer-lance-1222


Sean,

I am using the Yersinia pestis microarray from TIGR. The chip is constructed from genes of two different strains of Y. pestis, KIM and CO92. It is mainly CO92 (3885 ORFs) but has 944 ORFs from KIM that are not found in CO92 (some genes are on the main chromosome and some are on virulence plasmids).

The GAL file does not contain any GenBank IDs, but it does contain locus names. The GPR files I have do not contain the locus name; instead they have a name that I believe is used only on this chip. The GenBank files for KIM and CO92 contain the locus names and, of course, the GenBank IDs. For example, in the GPR file a gene is named NTORF3478. In the GAL file, NTORF3478 has the locus name y3526. The GenBank file then contains the y3526 gene, which has the GID 1148473.

I am not sure whether all features on the chip are present in the GenBank files. If I used the KIM chromosome as the reference, however, the genes from CO92 would not be mapped properly.

I was just hoping there would be an easy way of having a file with something like

    Name on chip, Contig Name, Start, End

loading that into an object and then, after running limma, viewing expression along the chromosome.

-Lance

-------------------------------

Lance,

You probably want to look at the AnnBuilder package, but I don't think it supports bacterial genomes (? for Jianhua). However, what is the annotation that you have for each "gene"? GenBank accession? RefSeq? Do you have the chromosome locations? What arrays are you using?

Sean

----- Original Message -----
From: "Palmer, Lance" <palmer@cshl.edu>
To: <bioconductor@stat.math.ethz.ch>
Sent: Monday, May 02, 2005 11:09 AM
Subject: [BioC] Defining your own chromosome annotations

> I am working with a number of bacterial genomes. I would like to define my
> own chromosome and annotations along the chromosomes, then view gene
> expression with regards to these genes. geneplotter and annotate seem to
> use already available data structures.
> Is there a way for a user to design
> their own?
>
> Thanks
> Lance Palmer
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>

------------------------------

Message: 13
Date: Tue, 03 May 2005 11:13:51 +1200
From: "Marcus Davy" <mdavy@hortresearch.co.nz>
Subject: Re: [BioC] LIMMA ignoring background
To: <bioconductor@stat.math.ethz.ch>, <guoneng.zhong@yale.edu>
Message-ID: <s2775d0c.081@hra2.marc.hort.cri.nz>
Content-Type: text/plain; charset=US-ASCII

There are several options. You could modify read.maimages so that it reads in only Rf and Gf. You could also read the foreground columns in as Rb and Gb as well, and remove them afterwards from the list elements of the RGList, e.g. for GenePix,

    read.maimages(files[1], "genepix",
                  columns = list(Rf = "F635 Mean", Gf = "F532 Mean",
                                 Rb = "F635 Mean", Gb = "F532 Mean"))

Or you could generate an RGList from scratch, populating it with your data loaded into R using something like scan or read.table, e.g.

    library(limma)
    RG <- new("RGList")
    # Two arrays, unrealistic data...
    RG$R <- matrix(2^rnorm(8*4*20*20*2), ncol = 2)
    RG$G <- matrix(2^rnorm(8*4*20*20*2), ncol = 2)
    # Print layout: 8 x 4 grids of 20 x 20 spots
    RG$printer <- list(ngrid.r = 8, ngrid.c = 4, nspot.r = 20, nspot.c = 20)
    # Mark the layout list as a PrintLayout object
    RG$printer <- structure(RG$printer, class = "PrintLayout")
    # No background correction, since there is no background to subtract
    MA <- normalizeWithinArrays(RG, method = "printtiploess", bc.method = "none")
    design <- rep(1, 2)
    fit <- lmFit(MA, design)
    fit <- eBayes(fit)
    topTable(fit, adjust.method = "fdr")

You can populate exprSets and marrayRaw class objects the same way using appropriate accessor methods.

Marcus

Marcus Davy
Bioinformatics

>>> Guoneng Zhong <guoneng.zhong@yale.edu> 05/03/05 8:39 AM >>>
Hi,
I have two-channel image result files that don't have background information, just median and mean intensity readings. But read.maimages requires that I provide Rb, Rf, Gb, Gf values, and I don't have the Rb and Gb values.
How do I make the analysis ignore those columns? I am doing a simple lmFit and topTable.

Thanks!
G

--
Systems Programmer
Yale Center for Medical Informatics
fax: 203-737-5708

------------------------------

Message: 14
Date: Tue, 3 May 2005 01:46:22 +0200
From: "Gorjanc Gregor" <gregor.gorjanc@bfro.uni-lj.si>
Subject: [BioC] "Special" characters in URI
To: <r-help@stat.math.ethz.ch>
Cc: bioconductor@stat.math.ethz.ch
Message-ID: <7FFEE688B57D7346BC6241C55900E730B700C2@pollux.bfro.uni-lj.si>
Content-Type: text/plain; charset="iso-8859-2"

Hello!

I am crossposting this to R-help and BioC, since it is relevant to both groups. I wrote a wrapper for the Entrez search utility (link provided below), which can add some new search functionality to the existing code in Bioconductor's package 'annotate'*.

http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html

The Entrez search utility returns an XML document, but I have a problem using a URI to retrieve that file, since the URI can contain characters that should not be there according to

http://www.faqs.org/rfcs/rfc2396.html

I encountered problems with "[" and "]" as well as with space characters. However, there might also be a problem with others, i.e. reserved characters in URI syntax. My R example is:

    R> library("annotate")
    Loading required package: Biobase
    Loading required package: tools
    Welcome to Bioconductor
             Vignettes contain introductory material. To view,
             simply type: openVignette()
             For details on reading vignettes, see
             the openVignette help page.
    R> library(XML)
    R> tmp$term <- "gorjanc g[au]"
    R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au]"
    R> tmp
    $term
    [1] "gorjanc g[au]"

    $URL
    [1] "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au]"

    R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
    Error in xmlTreeParse(tmp$URL, isURL = TRUE, handlers = NULL, asTree = TRUE) :
            error in creating parser for http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au]

    # so I have a problem with space and [ and ]
    # let's reduce the problem to just space or [] to be sure
    R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g"
    R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
    Error in xmlTreeParse(tmp$URL, isURL = TRUE, handlers = NULL, asTree = TRUE) :
            error in creating parser for http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g

    R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc[au]"
    R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
    Error in xmlTreeParse(tmp$URL, isURL = TRUE, handlers = NULL, asTree = TRUE) :
            error in creating parser for http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc[au]

    # now show that it works fine without special chars
    R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc"
    R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
    $doc
    $file
    [1] "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc"

    $version
    [1] "1.0"

    $children
    ...

    # now show a workaround for space
    R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc%20g"
    R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
    $doc
    $file
    [1] "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc%20g"

    $version
    [1] "1.0"

    $children
    ...

As can be seen from the above, it is possible to handle these special characters, and I wonder whether this has already been done somewhere. If not, I thought of a function fixURLchar, which would replace reserved characters with their escaped sequences. Any comments, pointers, ... ?

    from = c(" ", "\"", ",", "#"),
    to = c("%20", "%22", "%2c", "%23"))

*When I solve the problem, I will send my code to the 'annotate' maintainer, who can include it in the package if he wishes.

Lep pozdrav / With regards,
    Gregor Gorjanc

----------------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty        URI: http://www.bfro.uni-lj.si/MR/ggorjan
Zootechnical Department     mail: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3                   tel: +386 (0)1 72 17 861
SI-1230 Domzale             fax: +386 (0)1 72 17 888
Slovenia, Europe
----------------------------------------------------------------------
"One must learn by doing the thing; for though you think you know it,
 you have no certainty until you try." Sophocles ~ 450 B.C.
----------------------------------------------------------------------

------------------------------

End of Bioconductor Digest, Vol 27, Issue 3
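The fixURLchar idea in the "Special characters in URI" message can be sketched in base R. This is only a minimal sketch under stated assumptions, not code from 'annotate': the name fixURLchar and the first four from/to pairs come from the message, while the function signature and the added "[" and "]" pairs are assumptions. Base R also provides URLencode() in the 'utils' package, which percent-encodes reserved characters when reserved = TRUE, so applying it to the search term alone may already be enough.

```r
# Hypothetical helper sketching the fixURLchar idea from the message above.
# 'from' and 'to' are parallel vectors: each literal string in 'from' is
# replaced by the escape sequence at the same position in 'to'.
# The "[" and "]" pairs are an assumed extension of the original list.
fixURLchar <- function(x,
                       from = c(" ", "\"", ",", "#", "[", "]"),
                       to   = c("%20", "%22", "%2c", "%23", "%5B", "%5D")) {
  stopifnot(length(from) == length(to))
  for (i in seq_along(from)) {
    # fixed = TRUE treats "[" and "]" as plain characters, not regex syntax
    x <- gsub(from[i], to[i], x, fixed = TRUE)
  }
  x
}

base_url <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term="
term     <- "gorjanc g[au]"

# Escape only the search term, then append it to the base URL
url_manual <- paste0(base_url, fixURLchar(term))

# Same idea with base R: URLencode() from the 'utils' package
url_builtin <- paste0(base_url, utils::URLencode(term, reserved = TRUE))

print(url_manual)
print(url_builtin)
```

Only the term is escaped, because escaping the whole URL would also encode the "?" and "=" that structure the query. The sketch does not handle a literal "%" in the input; if that can occur, it would need to be escaped first.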
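For the question at the top of this thread (a file with name on chip, contig name, start and end, loaded into an object and used to view expression along the chromosome), a plain data frame may be enough. The following is a minimal base R sketch under stated assumptions: the column names, contig labels, coordinates and log-ratios are invented placeholders (only the spot name NTORF3478 appears in the post), and in practice the log-ratios would come from the limma fit, for example the logFC column of topTable().

```r
# Hypothetical mapping table: name on chip, contig name, start, end.
# In practice this would be read with read.table() or read.csv(); the values
# below are invented placeholders for illustration only.
ann <- data.frame(
  Name   = c("NTORF3478", "NTORF0001", "NTORF0002", "NTORF0003"),
  Contig = c("KIM_chr", "CO92_chr", "CO92_chr", "CO92_plasmid"),
  Start  = c(3900000, 1000, 250000, 500),
  End    = c(3901200, 2200, 251500, 1400),
  stringsAsFactors = FALSE
)

# Hypothetical per-gene log-ratios keyed by the name on the chip, standing in
# for limma output (e.g. the logFC column of topTable()).
res <- data.frame(
  Name  = c("NTORF0002", "NTORF3478", "NTORF0003", "NTORF0001"),
  logFC = c(-1.2, 2.1, 0.4, 0.3),
  stringsAsFactors = FALSE
)

# Join expression to positions by chip name, then order along each contig
dat <- merge(ann, res, by = "Name")
dat$Mid <- (dat$Start + dat$End) / 2
dat <- dat[order(dat$Contig, dat$Mid), ]

# One panel per contig: log-ratio against gene midpoint
contigs <- unique(dat$Contig)
op <- par(mfrow = c(length(contigs), 1))
for (ctg in contigs) {
  d <- dat[dat$Contig == ctg, ]
  plot(d$Mid, d$logFC, type = "h", main = ctg,
       xlab = "Position (bp)", ylab = "log fold change")
  abline(h = 0, lty = 2)
}
par(op)
```

Keeping the contig as an ordinary column means genes from KIM, CO92 and the plasmids can sit in one table without forcing everything onto a single reference chromosome.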

Microarray geneplotter limma AnnBuilder genomes • 1.3k views
