Creating annotation package with a new database schema
@fabian-grammes-5562
Last seen 9.6 years ago
Dear List,

I am working with Atlantic salmon and would like to build a custom annotation package for the microarray that I am using.

I've worked through the tutorial from Gabor Csardi ("Creating an annotation package with a new database schema"), which was very helpful. However, I am struggling to implement the Bimap objects needed to access the GO annotations that I have in the DB.

The GO data is stored in 6 tables (BP, BP_all, MF, MF_all, CC, CC_all) that follow the format I found in the BioC organism packages:

    ID    GOID        evi
    6092  GO:0000910  IEA
    6092  GO:0040035  IEA
    6092  GO:0000398  IEA

So if someone could point me to the correct way to implement the GO mappings in an annotation package, that would be great.

Kind regards,
Fabian
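For readers who want to see that table layout concretely, here is a minimal base R sketch. It uses only the three rows quoted above; the data frame name `BP` mirrors the table name, while `go_by_id` and `id_by_go` are illustrative names that are not part of any package. It only shows the two lookup directions a GO map has to support, not how a Bimap is built.

```r
# The three example rows from the BP table, held in a plain data frame.
BP <- data.frame(
  ID   = c(6092L, 6092L, 6092L),
  GOID = c("GO:0000910", "GO:0040035", "GO:0000398"),
  evi  = c("IEA", "IEA", "IEA"),
  stringsAsFactors = FALSE
)

# Direct lookup: internal gene ID -> GO IDs.
go_by_id <- split(BP$GOID, BP$ID)

# Reverse lookup: GO ID -> internal gene IDs.
id_by_go <- split(BP$ID, BP$GOID)
```

The direct lookup is keyed by the internal ID, the reverse lookup by GO ID; a real annotation package would additionally chain probe IDs to these internal IDs.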
Annotation GO Organism
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.7 years ago
United States
Hi Fabian,

Have you seen this function (it's in AnnotationForge)?

    ?makeOrgPackageFromNCBI

Marc
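A call might look like the sketch below. The argument names follow the function's help page; the version string, the maintainer and author values, and the wrapper name `build_salmon_org_package` are placeholders, and the taxonomy ID 8030 for Salmo salar is an assumption worth checking against NCBI Taxonomy. Whether NCBI holds enough salmon data for a useful package is not something this sketch settles.

```r
# Hypothetical wrapper: nothing is downloaded or built until it is called.
build_salmon_org_package <- function(output_dir = ".") {
  if (!requireNamespace("AnnotationForge", quietly = TRUE)) {
    stop("AnnotationForge is required")
  }
  AnnotationForge::makeOrgPackageFromNCBI(
    version    = "0.1",                          # placeholder
    maintainer = "Some One <so@someplace.org>",  # placeholder
    author     = "Some One <so@someplace.org>",  # placeholder
    outputDir  = output_dir,
    tax_id     = "8030",   # assumed NCBI taxonomy ID for Salmo salar
    genus      = "Salmo",
    species    = "salar"
  )
}
```

Wrapping the call in a function keeps the example inert on load, since the real function fetches a large amount of data from NCBI when it runs.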
@herve-pages-1542
Last seen 1 day ago
Seattle, WA, United States
Hi Fabian,

If you look, for example, at the hgu95av2.db package, it provides 3 predefined Bimaps for accessing the GO data: hgu95av2GO (GO map), hgu95av2GO2PROBE (GO2PROBE map), and hgu95av2GO2ALLPROBES (GO2ALLPROBES map). The 1st is a direct map; the 2nd and 3rd are reverse maps:

    > direction(hgu95av2GO)
    [1] 1
    > direction(hgu95av2GO2PROBE)
    [1] -1
    > direction(hgu95av2GO2ALLPROBES)
    [1] -1

All of them are of class "ProbeGo3AnnDbBimap".

The predefined Bimaps are created at load-time: the direct maps with a call to AnnotationDbi:::createAnnDbBimaps(), and the reverse maps by "manually" reversing some of the direct maps returned by createAnnDbBimaps().

So you need to add an entry for the GO map to the list of "seeds" passed to createAnnDbBimaps(). In your case this entry needs to look something like this (assuming ID is your internal id for genes):

    seeds <- list(
        ...
        list(
            objName="GO",
            Class="ProbeGo3AnnDbBimap",
            L2Rchain=list(
                list(
                    tablename="probes",
                    Lcolname="probe_id",
                    Rcolname="gene_id",
                    filter="{is_multiple}='0'"
                ),
                list(
                    tablename="genes",
                    Lcolname="gene_id",
                    Rcolname="ID"
                ),
                list(
                    Lcolname="ID",
                    tagname=c(Evidence="{evi}"),
                    Rcolname="GOID",
                    Rattribnames=c(Ontology="NULL")
                )
            ),
            rightTables=c(BP="BP", CC="CC", MF="MF")
        )
        ...
    )

Then:

    ann_objs <- createAnnDbBimaps(seeds, seed0)

where 'seed0' is defined by something like:

    seed0 <- list(objTarget="chip <name_of_your_chip>", datacache=datacache)

and 'datacache' is the environment that will be used for package-level caching of the data loaded from the DB (use NULL for no caching). I'm assuming those extra details, which are not GO-specific, are covered in Gabor's document, but I don't know.

Then you can append the reverse maps to 'ann_objs' with something like:

    ## Append GO2PROBE map:
    map <- ann_objs$GO
    map <- revmap(map)
    map@objName <- "GO2PROBE"
    ann_objs$GO2PROBE <- map

    ## Append GO2ALLPROBES map:
    map <- ann_objs$GO2PROBE
    map@rightTables <- c(BP="BP_all", CC="CC_all", MF="MF_all")
    map@objName <- "GO2ALLPROBES"
    ann_objs$GO2ALLPROBES <- map

All this needs to happen at load-time (via the .onLoad hook). Again, I'm focusing on the GO-specific part of the story here, assuming that you've already managed to create the non-GO-specific maps (thanks to Gabor's document).

Hope this helps,

H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
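Once the three maps exist, a quick sanity check could look like the sketch below. It assumes `ann_objs` is the list built as described in this reply; `check_go_maps` is a hypothetical helper name, not an AnnotationDbi function. It relies only on `direction()` as shown above and on `toTable()`, which AnnotationDbi provides for flattening a Bimap into a data frame.

```r
# Hypothetical helper: verify that the GO maps exist and point the expected way.
check_go_maps <- function(ann_objs) {
  # Expected directions: 1 for the direct map, -1 for the reverse maps.
  expected <- c(GO = 1L, GO2PROBE = -1L, GO2ALLPROBES = -1L)

  missing_maps <- setdiff(names(expected), names(ann_objs))
  if (length(missing_maps) > 0) {
    stop("missing maps: ", paste(missing_maps, collapse = ", "))
  }
  if (!requireNamespace("AnnotationDbi", quietly = TRUE)) {
    stop("AnnotationDbi is required for these checks")
  }
  for (nm in names(expected)) {
    stopifnot(AnnotationDbi::direction(ann_objs[[nm]]) == expected[[nm]])
  }
  # Flatten the direct map so the probe, GO ID and evidence rows can be inspected.
  head(AnnotationDbi::toTable(ann_objs$GO))
}
```

Calling `check_go_maps(ann_objs)` at the end of the load-time code during development (and removing it afterwards) is one way to catch a forgotten reverse map early.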
Hi Hervé,

Thanks a lot, that was exactly the information I was looking for!

After updating Bioconductor today, I am struggling a bit to get the code working again, but I hope that will be fixed tomorrow :)

@Marc

I've checked the function makeOrgPackageFromNCBI; however, since most of my annotation information is stored locally (GO etc., obtained via Blast2GO) and is not yet available at NCBI, I do not think the function helps in my case.

Cheers,
F
Hi Fabian,

If the data for GO is not available at NCBI, makeOrgPackageFromNCBI will try to use blast2GO instead (for GO at least).

Marc