Working with non-type strain annotation
@thomas-lin-pedersen-5941
Last seen 8.3 years ago
Copenhagen, Denmark
Hi

I'm doing proteomics on industrial bacterial strains. The genomes of these strains are almost complete (the contigs have not been joined), so my main genomic data is a list of CDSs. I have functionally annotated these using Blast2GO, and so for most of the CDSs I have GO terms, possibly an EC number, and the UniProt ID of the closest match.

My question is: how do I best proceed with this data in the Bioconductor framework when I want to do things such as gene set enrichment analysis? Is the best approach to build my own annotation package for each strain, or is there a simpler 'ad hoc' data structure that supports the same functionality?

It seems that most of the tutorials assume that you work on type strains (which is probably true for the most part), where an annotation package is readily available.

best

Thomas
Proteomics GO genomes • 1.3k views
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.7 years ago
United States
Hi Thomas,

Have you looked at the makeOrgPackageFromNCBI() function in the AnnotationForge package?

    library(AnnotationForge)
    ?makeOrgPackageFromNCBI

It is sometimes useful for less common organisms. In your case, however, it might not work, since there is a chance that even NCBI does not have annotations available for your strains. If that is the case, you would have to do some more custom work, depending on what information you actually have.

Marc

On 05/16/2013 01:30 AM, Thomas Dybdal Pedersen wrote:
> Hi
>
> I'm doing proteomics on industrial bacterial strains. [...]
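As a rough illustration of what the "custom work" could look like when no annotation package exists, here is a minimal base-R sketch of an ad hoc structure: a named list mapping each CDS to its GO terms, plus a per-term hypergeometric over-representation test. The CDS identifiers, the CDS-to-GO assignments, the `hits` vector, and the helper `enrich_term()` are all made up for illustration; a real Blast2GO export would need to be read in and reshaped into the same two-column form first.

```r
# Hypothetical Blast2GO-style table: one row per CDS / GO term pair.
# The CDS names and their GO assignments are invented for this sketch.
annot <- data.frame(
  cds = c("cds1", "cds1", "cds2", "cds3", "cds3", "cds4", "cds5", "cds6"),
  go  = c("GO:0006096", "GO:0008152", "GO:0006096", "GO:0006096",
          "GO:0006412", "GO:0006412", "GO:0008152", "GO:0006412"),
  stringsAsFactors = FALSE
)

# Ad hoc annotation structure: named list, CDS -> character vector of GO terms.
gene2go <- split(annot$go, annot$cds)

# Universe = every annotated CDS; hits = CDSs of interest (hypothetical).
universe <- names(gene2go)
hits <- c("cds1", "cds2", "cds3")

# One-sided hypergeometric test for over-representation of a single GO term.
enrich_term <- function(term, gene2go, hits, universe) {
  has_term <- vapply(gene2go[universe], function(x) term %in% x, logical(1))
  k <- sum(has_term[hits])     # hits annotated with the term
  m <- sum(has_term)           # universe members annotated with the term
  n <- length(universe) - m    # universe members without the term
  p <- phyper(k - 1, m, n, length(hits), lower.tail = FALSE)
  data.frame(term = term, in_hits = k, in_universe = m, p_value = p)
}

# Test every term, adjust for multiple testing, and sort by p-value.
terms <- unique(annot$go)
result <- do.call(rbind, lapply(terms, enrich_term,
                                gene2go = gene2go, hits = hits,
                                universe = universe))
result$p_adjusted <- p.adjust(result$p_value, method = "BH")
result <- result[order(result$p_value), ]
print(result)
```

This sketch only tests the terms directly assigned to each CDS and ignores the GO hierarchy. The same named-list shape (gene ID to vector of GO IDs) is what topGO accepts through its `annFUN.gene2GO` annotation function, so the structure can be reused there if a graph-aware test is wanted.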