How to select the genes mapped to an enriched KEGG pathway (kegga)
1
0
Entering edit mode
mat149 ▴ 80
@mat149-11450
Last seen 22 days ago
United States

Hello,

 

I used the kegga function to gather pathway enrichment for my dataset.  I would like to obtain a data frame (or any possible format) containing which genes are "up" and "down" for a given kegg pathway. So for example, I want to select all genes from the "N" column, "up" column" and "down" column for the row "path:dre04020".

I would also like to do this for a gene ontology enrichment table returned from goana.  For example, I would like to select/extract all genes from the N/up/down columns for row "GO:0022613".

I tried using the select function but haven't had much luck. Does anyone know if this can be done?

 

Thanks,

Matt

 

    N Up Down P.Up P.Down
path:dre01100 Metabolic pathways 1074 291 27 4.29E-85 0.000174
path:dre04020 Calcium signaling pathway 216 70 4 3.03E-24 0.266749
path:dre01200 Carbon metabolism 114 47 2 1.27E-21 0.403579
path:dre04080 Neuroactive ligand-receptor interaction 313 77 13 1.99E-18 0.000105
path:dre00010 Glycolysis / Gluconeogenesis 70 32 1 1.02E-16 0.575509

 

   

Ont

N

Up

Down

P.Up

P.Down

GO:0022613

ribonucleoprotein complex biogenesis

BP

205

0

87

1

1.65E-60

GO:0034660

ncRNA metabolic process

BP

215

0

87

1

2.22E-58

GO:0005730

nucleolus

CC

132

0

68

1

1.93E-54

GO:0042254

ribosome biogenesis

BP

143

0

69

1

1.04E-52

GO:0034470

ncRNA processing

BP

163

0

72

1

1.45E-51

kegga • 4.8k views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 10 hours ago
WEHI, Melbourne, Australia

This is the sort of thing that we don't provide special functions for in limma, but which you can do reasonably easily for yourself if you are confident in using R.

Suppose you have done a limma linear model analysis resulting in a fit object. You might look at the topTable:

tab <- topTable(fit, n=Inf)

Now you want to a KEGG analysis. You could download the KEGG pathway annotation for zebra fish by:

GK <- getGeneKEGGLinks(species.KEGG = "dre")

Then you could do the KEGG analysis:

k <- kegga(fit, species.KEGG="dre", gene.pathway=GK)
topKEGG(k)

Now if you want to see the top-table genes that belong to a particular pathway (say path:dre01100), it is just a matter of using standard subseting operations:

i <- tab$GeneID %in% GK$GeneID[GK$PathwayID=="path:dre01100"]
tab[i,]

Edit:

The above code assumes your gene IDs are stored in the "GeneID" column. If you have stored them under a different column name, then you need to make the obvious change, such as:

i <- tab$ENTREZID %in% GK$GeneID[GK$PathwayID=="path:dre01100"]
tab[i,]

You might also need to use as.character(tab$ENTREZID) if you have stored the column as a number or as a factor.

ADD COMMENT
0
Entering edit mode

Hey Gordon, 

I tried your script but I had to modify a few things to run the KEGG pathway enrichment:

entzzvec<-as.vector(data.fit.eb$genes$ENTREZID)
tab <- topTable(data.fit.eb,lfc=1.5,adjust="BH",p.value=0.01, n=Inf)
GK <- getGeneKEGGLinks(species.KEGG = "dre")
k <- kegga(data.fit.eb,coef=1,geneid=entzzvec,species.KEGG="dre", gene.pathway=GK,FDR=0.01)
topKEGG(k)
i <- tab$GeneID %in% GK$GeneID[GK$PathwayID=="path:dre01100"]
tab[i,]

 

When I run the last line, it returns:

> tab[i,]
 [1] PROBEID            ID                 SYMBOL             GENENAME          
 [5] ENTREZID           morphant...control rescue...control   morphant...rescue 
 [9] AveExpr            F                  P.Value            adj.P.Val         
<0 rows> (or 0-length row.names)

 

I am not sure what to do from here, am I getting close?

 

ADD REPLY
0
Entering edit mode

You are using tab$GeneID when you don't actually have such a column. So naturally i contains nothing and tab[i,] has zero rows. See my edit above.

ADD REPLY
0
Entering edit mode

Haha thanks for pointing that out (as.character(tab$ENTREZID)), I missed it completely :X

this works, thank you!

ADD REPLY
0
Entering edit mode

Hey,

 

I have been struggling to emulate this with GO categories.  I am stuck b/c I do not know how to generate an "R" object containing all GO term annotations for "Dr" analogous to: GK <- getGeneKEGGLinks(species.KEGG = "dre"). Beyond that, would i <- tab$GeneID %in% GK$GeneID[GK$PathwayID=="path:dre01100"] then need to be changed to: i <- tab$GeneID %in% GK$GeneID[GK$TermID=="GO:0004984"] for gene ontologies?

 

 

Here is an example of five categories I am interested in.

  Term Ont N Up Down P.Up P.Down
GO:0004984 olfactory receptor activity MF 102 48 0 1.15E-30 1
GO:0050877 neurological system process BP 258 63 47 3.42E-21 0.010118
GO:0007601 visual perception BP 61 2 32 0.905847 1.7E-13
GO:0008066 glutamate receptor activity MF 22 0 18 1 4.65E-13
GO:0045202 synapse CC 167 2 68 0.999791 2.59E-19

 

entzzvec<-as.vector(data.fit.eb$genes$ENTREZID)
tab1 <- topTable(data.fit.eb,lfc=1.5,adjust="BH",p.value=0.01, n=Inf,coef=1)
??? GG<-GET GO TERM ANNOTATIONS ???
MOgo <- goana(data.fit.eb, coef=1,FDR=0.01,geneid=data.fit.eb$genes$ENTREZID,species="Dr")
topMOgo<-topGO(MOgo,number=50)
i <- tab1$GeneID %in% GG$GeneID[GG$TermID=="GO:0004984"]
tab1[i,]

 

 

 

 

 

Thanks,

Matt

ADD REPLY

Login before adding your answer.

Traffic: 794 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6