Dear List
I am working with Atlantic salmon and am very interested in making a
custom annotation package for the microarray that I am using.
I've worked through the tutorial from Gabor Csardi ("Creating an
annotation package with a new database schema"), which was very
helpful. However, I am struggling to implement the Bimap objects to
access the GO annotations that I have in the DB.
The GO data is stored in 6 tables (BP, BP_all, MF, MF_all, CC, CC_all)
in the same format that I found in the organism packages in BioC:
    ID    GOID        evi
    6092  GO:0000910  IEA
    6092  GO:0040035  IEA
    6092  GO:0000398  IEA
So if someone could help me or point me to the correct way to
implement the GO mappings in an annotation package, that would be great.
kind regards, Fabian
Hi Fabian,
Have you seen this function (it's in AnnotationForge)?
?makeOrgPackageFromNCBI
Marc
On 10/22/2012 05:57 AM, Fabian Grammes wrote:
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
Hi Fabian,
If you look for example at the hgu95av2.db package, it provides 3
predefined Bimaps for accessing the GO data: hgu95av2GO (GO map),
hgu95av2GO2PROBE (GO2PROBE map), and hgu95av2GO2ALLPROBES (GO2ALLPROBES
map). The 1st is a direct map, the 2nd and 3rd are reverse maps:
> direction(hgu95av2GO)
[1] 1
> direction(hgu95av2GO2PROBE)
[1] -1
> direction(hgu95av2GO2ALLPROBES)
[1] -1
All of them are of class "ProbeGo3AnnDbBimap".
The predefined Bimaps are created at load-time: the direct maps
with a call to AnnotationDbi:::createAnnDbBimaps(), and the reverse
maps by "manually" reversing some of the direct maps returned by
createAnnDbBimaps().
So you need to add an entry for the GO map to the list of "seeds"
passed to createAnnDbBimaps(). In your case this entry needs to look
something like (assuming ID is your internal id for genes):
    seeds <- list(
        ...
        list(
            objName="GO",
            Class="ProbeGo3AnnDbBimap",
            L2Rchain=list(
                list(
                    tablename="probes",
                    Lcolname="probe_id",
                    Rcolname="gene_id",
                    filter="{is_multiple}='0'"
                ),
                list(
                    tablename="genes",
                    Lcolname="gene_id",
                    Rcolname="ID"
                ),
                list(
                    Lcolname="ID",
                    tagname=c(Evidence="{evi}"),
                    Rcolname="GOID",
                    Rattribnames=c(Ontology="NULL")
                )
            ),
            rightTables=c(BP="BP", CC="CC", MF="MF")
        )
        ...
    )
Then:
    ann_objs <- createAnnDbBimaps(seeds, seed0)
where 'seed0' is defined by something like:
    seed0 <- list(objTarget="chip <name_of_your_chip>",
                  datacache=datacache)
and 'datacache' is the environment that will be used for package-level
caching of the data loaded from the DB (use NULL for no caching). I'm
assuming those extra details, which are not GO-specific, are covered
in Gabor's document, but I don't know.
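To picture what the L2Rchain in the GO seed describes, here is a minimal sketch in base R that walks the same three steps (probes, then genes, then one GO table) with plain data frames. It does not use AnnotationDbi at all; the toy tables, the probe and gene ids, and the extra ID 7000 are made up for illustration, and only the column names are taken from the seed above.

```r
# Toy stand-ins for the tables named in the seed (values are made up,
# except the three GO rows for ID 6092 quoted in the original post).
probes <- data.frame(
  probe_id    = c("p1", "p2", "p3"),
  gene_id     = c("g1", "g2", "g2"),
  is_multiple = c(0, 0, 1),
  stringsAsFactors = FALSE
)
genes <- data.frame(
  gene_id = c("g1", "g2"),
  ID      = c(6092, 7000),
  stringsAsFactors = FALSE
)
BP <- data.frame(
  ID   = c(6092, 6092, 6092, 7000),
  GOID = c("GO:0000910", "GO:0040035", "GO:0000398", "GO:0008150"),
  evi  = c("IEA", "IEA", "IEA", "IEA"),
  stringsAsFactors = FALSE
)

# Step 1: probe_id -> gene_id, keeping only rows that pass the filter.
step1 <- probes[probes$is_multiple == 0, c("probe_id", "gene_id")]
# Step 2: gene_id -> ID through the genes table.
step2 <- merge(step1, genes, by = "gene_id")
# Step 3: ID -> GOID (plus evidence code) through one of the right tables.
step3 <- merge(step2, BP, by = "ID")

go_bp <- step3[order(step3$probe_id, step3$GOID),
               c("probe_id", "GOID", "evi")]
go_bp$Ontology <- "BP"   # the real map tags each row with its ontology
rownames(go_bp) <- NULL
print(go_bp)
```

The real GO map would repeat step 3 for each entry of rightTables (BP, CC, MF) and combine the results; probe p3 drops out here because it fails the is_multiple filter.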
Then you can append the reverse maps to 'ann_objs' with something like:
    ## Append GO2PROBE map:
    map <- ann_objs$GO
    map <- revmap(map)
    map@objName <- "GO2PROBE"
    ann_objs$GO2PROBE <- map
    ## Append GO2ALLPROBES map:
    map <- ann_objs$GO2PROBE
    map@rightTables <- c(BP="BP_all", CC="CC_all", MF="MF_all")
    map@objName <- "GO2ALLPROBES"
    ann_objs$GO2ALLPROBES <- map
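As a rough illustration of how the three maps differ, here is a small base R sketch using plain lists instead of Bimaps. It assumes the *_all tables follow the organism-package convention, where a gene annotated to a term is also listed under that term's ancestors; the probe ids and the two toy tables are made up.

```r
# Direct annotations (what a BP table would hold, already joined to probes).
bp <- data.frame(
  probe_id = c("p1", "p2"),
  GOID     = c("GO:0000910", "GO:0000398"),
  stringsAsFactors = FALSE
)
# Same annotations plus an ancestor term (what BP_all would add).
# GO:0008150 (biological_process) is an ancestor of both terms above.
bp_all <- rbind(
  bp,
  data.frame(probe_id = c("p1", "p2"), GOID = "GO:0008150",
             stringsAsFactors = FALSE)
)

go           <- split(bp$GOID, bp$probe_id)          # like GO: probe -> GO ids
go2probe     <- split(bp$probe_id, bp$GOID)          # like GO2PROBE: reversed
go2allprobes <- split(bp_all$probe_id, bp_all$GOID)  # like GO2ALLPROBES

print(go2probe[["GO:0008150"]])      # NULL: no probe is annotated directly
print(go2allprobes[["GO:0008150"]])  # both probes, via their child terms
```

So GO2PROBE and GO2ALLPROBES point in the same direction and differ only in which tables they read, which is why swapping rightTables on the reversed map is enough.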
All this needs to happen at load-time (via the .onLoad hook). Again I'm
focusing on the GO-specific part of the story here, assuming that you've
already managed to create the non-GO specific maps (thanks to Gabor's
document).
Hope this helps,
H.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
Hi Hervé,
Thanks a lot, that was exactly the information that I've been
looking for !
After updating Bioconductor today, I am struggling a bit with
getting the code to work again, but that should be fixed tomorrow
I hope :)
@ Marc
I've checked the function: makeOrgPackageFromNCBI,
however since I have most of my annotation information stored locally
(GO etc. - obtained via Blast2GO) and not yet available at NCBI,
I do not think the function helps in my case.
cheers, F
Hi Fabian,
If the data for GO is not available at NCBI, makeOrgPackageFromNCBI will
try to use Blast2GO instead (for GO at least).
Marc