Building the tomato annotation library (Affy)


Jorge Mena-Ali ▴ 10


I'm trying to obtain the annotation file for the Affy tomato chip. Any suggestions on specific code to append this file to the eset will be appreciated.

Jorge

****************************
Jorge Mena-Ali, PhD
Visiting Assistant Professor
Dept of Biology, Franklin & Marshall College
Lancaster PA 17604
****************************

Annotation affy • 1.6k views

updated 13.1 years ago by James W. MacDonald • written 13.1 years ago by Jorge Mena-Ali


James W. MacDonald 68k


Hi Jorge,

On 12/10/2012 12:00 PM, Jorge Mena-Ali wrote:
> I'm trying to obtain the annotation file for the Affy tomato chip. Any
> suggestions on specific code to append this file to the eset will be
> appreciated.

There are two general ways to handle this situation (that I know of).

1.) Just use the Affy annotation file directly.
2.) Build an org package and then use, say, UniGene or Gene IDs from the annotation file to map things.

For #1, you can download the csv file from Affy and do something like

> dat <- read.csv("Tomato.na33.annot.csv", header = TRUE, skip = 13, na.strings = "---")
> names(dat)
 [1] "Probe.Set.ID"                     "GeneChip.Array"
 [3] "Species.Scientific.Name"          "Annotation.Date"
 [5] "Sequence.Type"                    "Sequence.Source"
 [7] "Transcript.ID.Array.Design."      "Target.Description"
 [9] "Representative.Public.ID"         "Archival.UniGene.Cluster"
[11] "UniGene.ID"                       "Genome.Version"
[13] "Alignments"                       "Gene.Title"
[15] "Gene.Symbol"                      "Chromosomal.Location"
[17] "Unigene.Cluster.Type"             "Ensembl"
[19] "Entrez.Gene"                      "SwissProt"
[21] "EC"                               "OMIM"
[23] "RefSeq.Protein.ID"                "RefSeq.Transcript.ID"
[25] "FlyBase"                          "AGI"
[27] "WormBase"                         "MGI.Name"
[29] "RGD.Name"                         "SGD.accession.number"
[31] "Gene.Ontology.Biological.Process" "Gene.Ontology.Cellular.Component"
[33] "Gene.Ontology.Molecular.Function" "Pathway"
[35] "InterPro"                         "Trans.Membrane"
[37] "QTL"                              "Annotation.Description"
[39] "Annotation.Transcript.Cluster"    "Transcript.Assignments"
[41] "Annotation.Notes"

and then you can use the existing functions in R to merge() (<- and that is a hint right there) the set of significant (or not) probesets with various annotations.

However, the Affy annotations are static as of the build date, and may be pretty stale by the time you get to them. For #2, you can always go to NCBI and build your own organism-level package, and use that to do the annotations.
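To make the merge() hint for option #1 concrete, here is a minimal sketch in base R. The probeset IDs, the `sig` table, its `logFC` column, and the `feature_ids` vector are all made up for illustration; in practice `dat` would be the data frame read from the Affy csv file and `sig` would come from your own analysis.

```r
# Toy stand-in for the Affy annotation table; in practice this is the
# `dat` data frame read from the csv file. Probeset IDs are hypothetical.
dat <- data.frame(
  Probe.Set.ID = c("probe_1", "probe_2", "probe_3"),
  Gene.Symbol  = c("SNF1", NA, "MKP1"),
  Gene.Title   = c("SNF1 protein", NA, "MAP kinase phosphatase"),
  stringsAsFactors = FALSE
)

# Hypothetical table of significant probesets, keyed by probeset ID.
sig <- data.frame(
  Probe.Set.ID = c("probe_3", "probe_1"),
  logFC        = c(1.8, -2.1),
  stringsAsFactors = FALSE
)

# Left join: keep every row of `sig` and add annotation columns where
# a matching probeset ID exists in `dat`.
annotated <- merge(sig, dat, by = "Probe.Set.ID", all.x = TRUE)
print(annotated)

# To line the annotation up with a whole set of features instead, match()
# reorders `dat` to follow a vector of IDs (a stand-in here for the
# feature names of an ExpressionSet).
feature_ids <- c("probe_2", "probe_3", "probe_1")
aligned <- dat[match(feature_ids, dat$Probe.Set.ID), ]
rownames(aligned) <- feature_ids
```

With Biobase loaded, a table aligned this way (one row per feature, row names equal to the feature names) is the kind of object one would assign with `fData(eset) <- aligned`; that step is not run here and assumes `eset` is your ExpressionSet.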
> library(AnnotationForge)
> makeOrgPackageFromNCBI(version = "0.0.1", author = "me", maintainer = "me <me@mine.org>", outputDir = ".", tax_id = 4081, genus = "Solanum", species = "lycopersicum")
Loading required package: GO.db
Getting data for gene2pubmed.gz
Loading required package: RCurl
Loading required package: bitops
Populating gene2pubmed table:
table gene2pubmed filled
Getting data for gene2accession.gz
<other blahblahblah snipped>
Creating package in ./org.Slycopersicum.eg.db
[1] TRUE

So after waiting a while, I get this message telling me a package has been made. And now I need to install it.

> install.packages("org.Slycopersicum.eg.db", repos = NULL, type = "source")
* installing *source* package org.Slycopersicum.eg.db ...
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (org.Slycopersicum.eg.db)

Now you can load this package and use it to annotate things:

> library(org.Slycopersicum.eg.db)
> x <- as.character(sample(dat$UniGene.ID[!is.na(dat$UniGene.ID)], 25))
> select(org.Slycopersicum.eg.db, x, c("SYMBOL","GENENAME"), "UNIGENE")
    UNIGENE SYMBOL             GENENAME
1 Les.20210   <NA>                 <NA>
2 Les.11435   <NA>                 <NA>
3 Les.12414   <NA>                 <NA>
4 Les.17835   SNF1         SNF1 protein
5  Les.1796   <NA>                 <NA>
6       ---   <NA>                 <NA>
7  Les.1268   MKP1 MAP kinase phosphatase
8  Les.7575   <NA>                 <NA>
9  Les.7326   <NA>                 <NA>
<snip>

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
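A note on working with the select() result from the answer: it can contain keys with no mapping (the NA rows) and, in general, more than one row per key. The sketch below shows one way to reduce such a table to one row per UniGene ID in base R. The data frame is a toy stand-in shaped like the select() output; the duplicated key and the symbol "MKP1b" are invented purely to illustrate a one-to-many mapping.

```r
# Toy data frame shaped like the select() output. The second Les.1268 row
# and the symbol "MKP1b" are made up to illustrate a one-to-many mapping.
res <- data.frame(
  UNIGENE = c("Les.17835", "Les.1268", "Les.1268", "Les.20210"),
  SYMBOL  = c("SNF1", "MKP1", "MKP1b", NA),
  stringsAsFactors = FALSE
)

# Drop keys that did not map to anything.
mapped <- res[!is.na(res$SYMBOL), ]

# Collapse multiple symbols per key into a single "; "-separated string,
# giving one row per UniGene ID.
collapsed <- aggregate(
  SYMBOL ~ UNIGENE,
  data = mapped,
  FUN  = function(s) paste(unique(s), collapse = "; ")
)
print(collapsed)
```

Whether to collapse, keep the first match, or drop ambiguous keys altogether depends on what the annotated table is for; collapsing is only one option.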

written 13.1 years ago by James W. MacDonald
