annotation and GO for non-model organism


Ingunn Berget ▴ 150

@ingunn-berget-1066

Last seen 11.3 years ago

Dear List,

I want to do GO analysis on my microarray results and have not done this before. We have a cDNA array for a non-model organism. The manufacturers of the array have provided annotations, so I have accession number, gene description, gene synonyms, EC, molecular_function, biological_process, cellular_component, InterPro, KEGG, Pfam, EMBL, Ensembl, UniGene, RefSeq, PROSITE, GeneId, org, and more in a tab-delimited txt file.

So I suppose I have all the information I need. How can I use this with the Bioconductor packages?

I have looked at the vignette for SQLForge in the AnnotationDbi package, as suggested on this list before, but since it says "At the present time, it is possible to make annotation packages for the most common model organisms", I don't know how to proceed.

Best regards,
Ingunn

Microarray Annotation GO Organism • 2.9k views

updated 16.3 years ago by Marc Carlson ★ 7.2k • written 16.3 years ago by Ingunn Berget ▴ 150


Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 3 months ago

EMBL European Molecular Biology Laborat…

Hi Ingunn,

You could check out the Category package, which has tools for detecting association between gene annotation categories and differential expression contrasts; see its vignette.

Best wishes,
Wolfgang

Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber

written 16.3 years ago by Wolfgang Huber ★ 13k


Marc Carlson ★ 7.2k

@marc-carlson-2264

Last seen 9.4 years ago

United States

Hi Ingunn,

First, you should determine whether your organism is one of our supported organisms. Since you describe it as a non-model organism I suspect it is not, but it is still worth checking. If it is supported, you should be able to get an organism-level package from our repository and use GOstats in the typical manner. Checking is straightforward: call the available.dbschemas() function in the AnnotationDbi package to see whether your organism is supported by a schema.

If it is not, we have a new workaround that works with the latest versions of the AnnotationDbi, GSEABase, GOstats and Category packages, which are presently in our development branch.

Since I suspect you will need the latter strategy, below is an example of how you should be able to proceed. It is very similar to how you would use the GOstats package traditionally, and you should probably read the vignette for that package before attempting this, for a more detailed explanation.

Please note that in the following example "frameData" is a data.frame object with 3 columns holding GO IDs, evidence codes and gene IDs, in that order. This is how you introduce the specific details for your organism. Also, be careful to ensure that these gene IDs are of the same type as the IDs in your 'universeGeneIds' and 'geneIds', and use a type of ID that is truly unique (I recommend something like Entrez Gene IDs).

    library("GOstats")
    library("GSEABase")
    library("AnnotationDbi")

    # Wrap the three-column table (GO ID, evidence code, gene ID)
    frame <- GOFrame(frameData, organism = "Homo sapiens")
    allFrame <- GOAllFrame(frame)

    # Turn the GO mappings into a gene set collection
    gsc <- GeneSetCollection(allFrame, setType = GOCollection())

    # 'genes' and 'universe' are vectors of gene IDs that you supply
    params <- GSEAGOHyperGParams(name = "My Custom GSEA based annot Params",
                                 geneSetCollection = gsc,
                                 geneIds = genes,
                                 universeGeneIds = universe,
                                 ontology = "MF",
                                 pvalueCutoff = 0.05,
                                 conditional = FALSE,
                                 testDirection = "over")

    Over <- hyperGTest(params)

Please let me know if you have questions or comments. This is a new capability that we are adding so that we can provide better support for non-model organisms.

Marc
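For readers starting from a manufacturer's tab-delimited annotation file like the one described in the question, the following is a minimal base-R sketch of how the three-column "frameData" table might be assembled. It is not taken from the thread. The column names GeneId, molecular_function and biological_process come from the question, but the cell layout (GO IDs of the form GO:0003677 embedded in free text), the toy values, and the placeholder evidence code "IEA" are all assumptions. The column names go_id, Evidence and gene_id are illustrative as well.

```r
# Toy stand-in for the manufacturer's file (values are made up). In practice
# you would read your own file, e.g. with read.delim(); the cell format
# below is an assumption about how GO terms are stored.
annot <- data.frame(
  GeneId = c("1001", "1002", "1003"),
  molecular_function = c("GO:0003677 DNA binding; GO:0005524 ATP binding",
                         "GO:0005524 ATP binding",
                         ""),
  biological_process = c("GO:0006355 regulation of transcription", "", ""),
  stringsAsFactors = FALSE
)

# Extract every GO ID from the chosen columns: one row per (GO ID, gene) pair.
go_cols <- c("molecular_function", "biological_process")
pairs <- lapply(go_cols, function(col) {
  ids <- regmatches(annot[[col]], gregexpr("GO:[0-9]{7}", annot[[col]]))
  data.frame(go_id = as.character(unlist(ids)),
             gene_id = rep(annot$GeneId, lengths(ids)),
             stringsAsFactors = FALSE)
})
pairs <- unique(do.call(rbind, pairs))

# Three columns in the order Marc describes: GO ID, evidence code, gene ID.
# "IEA" is a placeholder assumption; use real evidence codes if you have them.
frameData <- data.frame(go_id = pairs$go_id,
                        Evidence = "IEA",
                        gene_id = pairs$gene_id,
                        stringsAsFactors = FALSE)

# universe: every gene with at least one GO annotation;
# genes: your differentially expressed subset (hypothetical selection here).
universe <- unique(frameData$gene_id)
genes <- c("1001")
```

The resulting frameData, genes and universe objects have the shapes that the GOFrame() and GSEAGOHyperGParams() calls in Marc's example expect. Genes with no GO term (such as "1003" above) drop out of the universe, which is worth keeping in mind when interpreting the test.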

written 16.3 years ago by Marc Carlson ★ 7.2k
