Building the tomato annotation library(Affy)
@jorge-mena-ali-5651
I'm trying to obtain the annotation file for the Affy tomato chip. Any suggestions on specific code to append this file to the eset will be appreciated.

Jorge

Jorge Mena-Ali, PhD
Visiting Assistant Professor
Dept of Biology, Franklin & Marshall College
Lancaster PA 17604
@james-w-macdonald-5106
Hi Jorge,

On 12/10/2012 12:00 PM, Jorge Mena-Ali wrote:
> I'm trying to obtain the annotation file for the Affy tomato chip. Any
> suggestions on specific code to append this file to the eset will be
> appreciated.

There are two general ways to handle this situation (that I know of).

1.) Just use the Affy annotation file directly.
2.) Build an org package and then use, say, UniGene or Gene IDs from the annotation file to map things.

For #1, you can download the csv file from Affy and do something like

> dat <- read.csv("Tomato.na33.annot.csv", header = TRUE, skip = 13, na.strings = "---")
> names(dat)
 [1] "Probe.Set.ID"                     "GeneChip.Array"
 [3] "Species.Scientific.Name"          "Annotation.Date"
 [5] "Sequence.Type"                    "Sequence.Source"
 [7] "Transcript.ID.Array.Design."      "Target.Description"
 [9] "Representative.Public.ID"         "Archival.UniGene.Cluster"
[11] "UniGene.ID"                       "Genome.Version"
[13] "Alignments"                       "Gene.Title"
[15] "Gene.Symbol"                      "Chromosomal.Location"
[17] "Unigene.Cluster.Type"             "Ensembl"
[19] "Entrez.Gene"                      "SwissProt"
[21] "EC"                               "OMIM"
[23] "RefSeq.Protein.ID"                "RefSeq.Transcript.ID"
[25] "FlyBase"                          "AGI"
[27] "WormBase"                         "MGI.Name"
[29] "RGD.Name"                         "SGD.accession.number"
[31] "Gene.Ontology.Biological.Process" "Gene.Ontology.Cellular.Component"
[33] "Gene.Ontology.Molecular.Function" "Pathway"
[35] "InterPro"                         "Trans.Membrane"
[37] "QTL"                              "Annotation.Description"
[39] "Annotation.Transcript.Cluster"    "Transcript.Assignments"
[41] "Annotation.Notes"

and then you can use the existing functions in R to merge() (<- and that is a hint right there) the set of significant (or not) probesets with various annotations.

However, the Affy annotations are static as to the build date, and may be pretty stale by the time you get to them. You can always go to NCBI and build your own organism-level package, and use that to do the annotations.
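For #1, the merge() step might look something like the sketch below. The two small data frames are made-up stand-ins, not real chip data: dat mimics a few columns of the Affy annotation table, and sig is a hypothetical table of significant probesets. The ps_* probeset IDs and the logFC values are invented for illustration; the column names are the ones read.csv() produces from the Affy file.

```r
# Toy stand-in for the annotation table returned by read.csv() (hypothetical rows)
dat <- data.frame(
  Probe.Set.ID = c("ps_1", "ps_2", "ps_3", "ps_4"),
  Gene.Symbol  = c("SNF1", NA, "MKP1", NA),
  UniGene.ID   = c("Les.17835", "Les.20210", "Les.1268", NA),
  stringsAsFactors = FALSE
)

# Toy stand-in for a table of significant probesets (hypothetical values)
sig <- data.frame(
  Probe.Set.ID = c("ps_3", "ps_1", "ps_9"),
  logFC        = c(1.8, -2.1, 0.9),
  stringsAsFactors = FALSE
)

# Left join on the probeset ID: all.x = TRUE keeps every row of sig,
# filling the annotation columns with NA where dat has no matching probeset
annotated <- merge(sig,
                   dat[, c("Probe.Set.ID", "Gene.Symbol", "UniGene.ID")],
                   by = "Probe.Set.ID", all.x = TRUE)
print(annotated)
```

Using all.x = TRUE is a deliberate choice here, so that probesets missing from the annotation file are kept rather than silently dropped.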
For #2, build the package with AnnotationForge:

> library(AnnotationForge)
> makeOrgPackageFromNCBI(version = "0.0.1", author = "me", maintainer = "me <me@mine.org>", outputDir = ".", tax_id = 4081, genus = "Solanum", species = "lycopersicum")
Loading required package: GO.db
Getting data for gene2pubmed.gz
Loading required package: RCurl
Loading required package: bitops
Populating gene2pubmed table:
table gene2pubmed filled
Getting data for gene2accession.gz
<other blahblahblah snipped>
Creating package in ./org.Slycopersicum.eg.db
[1] TRUE

So after waiting a while, I get this message telling me a package has been made. And now I need to install.

> install.packages("org.Slycopersicum.eg.db", repos = NULL, type = "source")
* installing *source* package org.Slycopersicum.eg.db ...
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (org.Slycopersicum.eg.db)

Now you can load this package and use it to annotate things:

> library(org.Slycopersicum.eg.db)
> x <- as.character(sample(dat$UniGene.ID[!is.na(dat$UniGene.ID)], 25))
> select(org.Slycopersicum.eg.db, x, c("SYMBOL","GENENAME"), "UNIGENE")
     UNIGENE SYMBOL               GENENAME
1  Les.20210   <NA>                   <NA>
2  Les.11435   <NA>                   <NA>
3  Les.12414   <NA>                   <NA>
4  Les.17835   SNF1           SNF1 protein
5   Les.1796   <NA>                   <NA>
6        ---   <NA>                   <NA>
7   Les.1268   MKP1 MAP kinase phosphatase
8   Les.7575   <NA>                   <NA>
9   Les.7326   <NA>                   <NA>
<snip>

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
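On the original question of appending the annotation to the eset, one possible approach (a sketch, not something spelled out in the reply above) is to line the annotation rows up with the probeset order of the data using match(), and then store the result as feature data. The IDs and rows below are invented; ids is a hypothetical stand-in for featureNames(eset). The last part assumes the Biobase package is installed and is skipped otherwise.

```r
# Hypothetical probeset IDs standing in for featureNames(eset)
ids <- c("ps_2", "ps_4", "ps_1")

# Toy stand-in for the annotation table (hypothetical rows)
dat <- data.frame(
  Probe.Set.ID = c("ps_1", "ps_2", "ps_3"),
  Gene.Symbol  = c("SNF1", NA, "MKP1"),
  stringsAsFactors = FALSE
)

# match() returns, for each ID in the data, its row number in dat (NA if absent),
# so annot has exactly one row per probeset, in the same order as the data
annot <- dat[match(ids, dat$Probe.Set.ID), ]
annot$Probe.Set.ID <- ids   # keep the ID even for probesets with no annotation
rownames(annot) <- ids      # feature data row names must equal the feature names

# With Biobase available, the aligned table can be attached as feature data
if (requireNamespace("Biobase", quietly = TRUE)) {
  exprs_mat <- matrix(rnorm(6), nrow = 3, dimnames = list(ids, c("s1", "s2")))
  eset <- Biobase::ExpressionSet(assayData = exprs_mat)
  Biobase::fData(eset) <- annot
  print(Biobase::fData(eset))
}
```

match() is used instead of merge() here because it preserves the row order of the expression data, which the feature data has to follow.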