GO.db: searching for "top" GO term for list of terms
Dear all,

I am currently working with a list of GO terms linked to gene IDs.

For example, like this, but for hundreds of genes:

GENE1 GO:xxxx GO:xxxxx GO:xxxxx

Because I want to group my genes into 'biologically logical' groups such as development, digestion, metabolism, etc., I was thinking of using the relevant GO term for this: the GO term directly below 'biological process', e.g. GO:0008152.

Is there a way with the GO.db package to do so? I have been trying the following:

> get("GO:0006508", GOBPANCESTOR)

But then I get all ancestor terms. Is there a way to get only the one that is 'second from the top'?

I guess I am not the first person who wants to group genes based on very broad biological processes. I do not want to reinvent the wheel, so if there is another, better way, please let me know!

Thanks in advance!

I don't know if there is an easy way to do this using the existing infrastructure. There might be, but this is a bit different from the usual use case, so maybe not. You can, however, get what you want from a direct query to the underlying database. Let's say we want all the direct child terms of the GO term you mention (GO:0006508).

> library(GO.db)
> con <- GO_dbconn()
> library(DBI)
## we need to map GO term to database ID
> dbGetQuery(con, "select _id from go_term where go_id='GO:0006508';")
1 5442

> dbGetQuery(con, "select go_id, term from go_term inner join go_bp_parents using(_id) where _parent_id='5442';")
       go_id                                                       term
1 GO:0016485                                         protein processing
2 GO:0051603 proteolysis involved in cellular protein catabolic process
3 GO:0030162                                  regulation of proteolysis
4 GO:0033619                               membrane protein proteolysis
5 GO:0035897                              proteolysis in other organism
6 GO:0045861                         negative regulation of proteolysis
7 GO:0045862                         positive regulation of proteolysis
8 GO:0070646              protein modification by small protein removal
9 GO:0097264                                           self proteolysis

And those GO terms should be the direct child terms for GO:0006508, which is 'Proteolysis'.
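
Coming back to the original question, the same idea can be turned around: take the direct children of the BP root (GO:0008150, 'biological_process') and intersect them with the ancestors of each term. Below is a rough sketch using the GOBPCHILDREN and GOBPANCESTOR maps from GO.db. The helper name top_bp_terms is made up for illustration, and I am assuming every input is a valid BP term. Note that a term can sit under more than one top-level term, so you may get several rows back, and which terms count as top-level depends on the GO release your GO.db was built from.

```r
library(GO.db)

# Root of the biological process ontology
bp_root <- "GO:0008150"

# Direct children of the root, i.e. the 'second from the top' terms
top_level <- unname(GOBPCHILDREN[[bp_root]])

# Hypothetical helper: top-level BP terms that a given GO ID falls under
top_bp_terms <- function(go_id) {
  ancestors <- GOBPANCESTOR[[go_id]]
  # Include the term itself in case it already is a top-level term
  hits <- intersect(c(go_id, ancestors), top_level)
  if (length(hits) == 0) {
    return(data.frame(GOID = character(0), TERM = character(0)))
  }
  # Look up the human-readable names of the matching terms
  AnnotationDbi::select(GO.db, keys = hits, columns = "TERM", keytype = "GOID")
}

top_bp_terms("GO:0006508")
```

For a whole gene list you could lapply() this over each gene's GO IDs and then collapse the results per gene.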


