custom-made annotation packages
Samuel Wuest ▴ 330
@samuel-wuest-2821
Hi all,

This is a conceptual question on how (if at all) to create custom-made annotations.

I am using the Affymetrix platform for Arabidopsis (ATH1). The newest annotation package (AnnDbBimap objects mapping Affy IDs to, e.g., GO terms) is provided on the Bioconductor page. However, Casneuf et al. (BMC Bioinformatics 2007, 8:461) have reannotated the Arabidopsis chip in order to get rid of cross-hybridizing and non-hybridizing probes, and I am using their custom-made CDF file to analyze my data. In the new CDF file, the Affy IDs that name the probesets have been replaced by gene accession numbers (ATG numbers in this case). This makes the annotation from Bioconductor useless to me, because the keys used there are Affy IDs.

So obviously I have to make new mappings from gene accession numbers to, e.g., GO terms, but that information is available in databases.

My questions: is it worth making a new annotation package for the chip, or could I just create my own environments that contain the mappings (if it's only for my project)? Which option would be less work and still allow the main analyses (e.g. GO enrichment) and/or be useful for the community?

Alternatively, could I just map the gene accession numbers back to the original Affy IDs and use the provided annotation package?

And: is there an easy-to-read manual on how to create annotation packages? (I know there are the vignettes, but I am not a bioinformatician.)

Thanks a million for any feedback,
best wishes, Sam
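One way to judge the "map back to the original Affy IDs" option is to invert a probeset-to-gene table and look for genes that end up with more than one probeset, since those have no single ID to map back to. The following is only a minimal sketch of that check: the probeset names are hypothetical placeholders (not real ATH1 IDs), and the pairing shown is illustrative rather than taken from the Casneuf et al. CDF.

```python
from collections import defaultdict

# Hypothetical probeset-to-gene pairs. The probeset names are placeholders,
# not real ATH1 identifiers; the pairing is for illustration only.
probeset_to_gene = [
    ("probeset_A_at", "AT1G01040"),
    ("probeset_B_at", "AT1G01060"),
    ("probeset_C_s_at", "AT1G01060"),
]


def invert(pairs):
    """Group probeset labels by the gene ID they are assigned to."""
    gene_to_probesets = defaultdict(list)
    for probeset, gene in pairs:
        gene_to_probesets[gene].append(probeset)
    return dict(gene_to_probesets)


gene_to_probesets = invert(probeset_to_gene)

# Genes with more than one probeset cannot be mapped back to a single ID.
ambiguous = sorted(
    gene for gene, probesets in gene_to_probesets.items() if len(probesets) > 1
)
print(ambiguous)
```

If the list of ambiguous genes is long, mapping back is likely to lose or duplicate information, which would favour keying the annotation on the gene IDs directly.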
Marc Carlson ★ 7.2k
@marc-carlson-2264
Hi Sam,

Maybe you should just use SQLForge to quickly make a custom annotation package for yourself. If you already have ATG IDs mapped to some kind of probeset ID, this should be straightforward. Nobody has needed to do this yet, so you will be a guinea pig making a package for Arabidopsis, but it should work, and if it does, it should be fairly quick. Have a look at the vignette to see the more general use case, and let me know if you have questions.

You can find the vignette here: http://bioconductor.org/packages/2.2/bioc/html/AnnotationDbi.html

All you will need is some sort of mapping from whatever you want to use as probe labels to Arabidopsis "TAIR" gene IDs (these look like: ATxxxxxxx). This mapping can then be used to make a custom annotation package.

Marc
Samuel Wuest ▴ 330
@samuel-wuest-2821
Hi Marc,

thanks for the quick reply! So you think it should be fairly easy to make such an annotation package?

Just to make sure I understand correctly: the ath1121501.db package uses the Affy ID as its global identifier (key values), not ATG IDs? Then I would have to create all the mappings from scratch (the GO mappings, the KEGG mappings, etc.).

Best, Sam

2008/5/28 Marc Carlson <mcarlson@fhcrc.org>:
> Maybe you should just use SQLForge to quickly make a custom annotation
> package for yourself. [...]
Hi Sam,

SQLForge will make almost all of the mappings (GO, KEGG, etc.) for you. All you need to provide is a single mapping from whatever probe IDs you want to use onto the appropriate TAIR IDs. Then you should only have to download two packages and call a single function to make the package (read the SQLForge vignette for details). The ath1121501.db package that you are referring to was made using this same function; the only difference is that its mapping came from a file provided by Affymetrix. As long as you know which probe goes with which TAIR ID, we should be able to make a package from that.

So, for example, you just need a set of IDs (these can be anything you want) that are mapped to TAIR IDs (which look like: ATxxxxxxx). Then you need a tab-separated file that looks something like:

myID1	AT1G01040
myID2	AT1G01060
myID3	AT1G01120
etc.

Marc
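For anyone preparing the two-column file described above, here is a minimal sketch of writing it while rejecting malformed gene IDs. It is not part of SQLForge; the function name `write_mapping` and the regular expression are assumptions made for illustration. The pattern covers locus codes of the form AT1G01040 (nuclear chromosomes 1-5, plus C and M for chloroplast and mitochondrial loci), so adjust it if your labels differ.

```python
import csv
import io
import re

# Assumed ID pattern: "AT", a chromosome (1-5, C or M), "G", then five digits.
TAIR_ID = re.compile(r"^AT[1-5CM]G\d{5}$")


def write_mapping(pairs, handle):
    """Write (probe_label, tair_id) pairs as tab-separated lines.

    IDs are upper-cased and stripped before checking. Pairs whose gene ID
    does not match the pattern are not written; they are returned instead.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    rejected = []
    for label, tair_id in pairs:
        tair_id = tair_id.strip().upper()
        if TAIR_ID.match(tair_id):
            writer.writerow([label, tair_id])
        else:
            rejected.append((label, tair_id))
    return rejected


# Example labels follow the post above; "myID4" is a deliberately bad entry.
pairs = [
    ("myID1", "AT1G01040"),
    ("myID2", "at1g01060"),
    ("myID3", "AT1G01120"),
    ("myID4", "not_an_id"),
]

buffer = io.StringIO()  # replace with open("mapping.txt", "w", newline="") for a real file
rejected = write_mapping(pairs, buffer)
mapping_text = buffer.getvalue()
print(mapping_text)
print("rejected:", rejected)
```

Checking the rejected list before building the package is cheap, and it catches stray suffixes or control probes that would otherwise end up as unannotated keys.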