Question

Mapping gene IDs to gene symbol of KEGG pathways with "Pathview"

alakatos ▴ 130

Hello,

I applied “Pathview” to overlay my gene expression data (“limma” logFC) on a KEGG pathway (TNF-alpha). This pathway contains 111 genes (http://www.genome.jp/dbget-bin/get_linkdb?-t+genes+path:mmu04668). However, not all of the genes appear in the KEGG graphic (http://www.genome.jp/kegg-bin/show_pathway?mmu04668). It seems that one gene symbol on the graph can represent many genes from the pathway. For instance, if you move your cursor over “PI3K” on the graph, several gene symbols pop up (e.g. Pik3ca, Pik3cd, Pik3r1, Pik3r2, Pik3r3, Pik3r5, Pik3cg, Pik3cb), but if you click on the gene symbol it takes you to a link indicating only one gene, Pik3ca. 109 genes of the TNFa pathway are represented on my gene expression array, including the 8 “PIK3” genes: 2 of those 8 genes have a negative logFC and 6 have a positive logFC.

Q1. When a gene symbol on the graph corresponds to more than one gene on the array (= gene.data for “Pathview”), as with “PI3K”, which genes and which value does that symbol represent on the graph made by “Pathview”? I searched the KEGG website and relevant articles but did not find a clear answer.

Q2. I would like to change the color-coding of the gene symbols on the "Pathview" graph. I would like to represent all negative values (-logFC) in shades of one color (a green palette, bright green to dark green) and all positive values (+logFC) in shades of another color (a red palette, dark red to bright red), divided at 0 with no gray in the middle. I tried different combinations of colors and bins, but I could not get the division exactly in the middle (at 0). In addition, when I increase the number of bins substantially, the color key runs off the chart.

Code:

pv.out <- pathview(gene.data = limma_result[, 1], pathway.id = "04668", species = "mmu", out.suffix = "TNFa", keys.align = "y", kegg.native = TRUE, both.dirs = TRUE, low = "green", mid = "gray", high = "red", bins = 20)

Any help is highly appreciated.

Thank you in advance.

Anita

Pathview KEGG annotation gene symbol visualization • 3.4k views

Updated 8.9 years ago by Luo Weijun ★ 1.6k • written 8.9 years ago by alakatos ▴ 130

score 0 · Answer 1 · 2015-06-16

When I click on each gene node, I get to an info page where all member genes are listed. In your case, I saw Pik3ca, Pik3cd, Pik3r1… (you can do an on-page find).

For your Q1, see page 10 of the Pathview vignette: “Note in native KEGG view, a gene node may represent multiple genes/proteins with similar or redundant functional role. The number of member genes range from 1 up to several tens. They are intentionally put together as a single node on pathway graphs for better clarity and readability. Therefore, we do not split node and mark each member genes separately by default. But rather we visualize the node-wise data by summarize gene-wise data, users may specify the summarization method using node.sum argument.” Check the node.sum argument of the pathview function for more details.

For your Q2, Pathview currently implements only a continuous color spectrum, with 3-point control (low, mid, high) by default and 2-point control when all data values lie in one direction (positive or negative) and both.dirs = FALSE.
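
To make the node-wise summarization concrete, here is a minimal base R sketch of what collapsing the eight PI3K member genes into one node value could look like under different summary methods. The logFC numbers are made up for illustration (two negative, six positive, as in the question), and `max_abs` is a hypothetical helper written for this example, not a pathview function. The sketch only mimics the idea behind `node.sum`; it does not reproduce pathview's internal code.

```r
# Hypothetical logFC values for the eight PI3K member genes
# (made-up numbers: two negative, six positive, as described in the question).
pi3k_logfc <- c(Pik3ca = 0.8, Pik3cd = -0.5, Pik3r1 = 1.2, Pik3r2 = 0.3,
                Pik3r3 = -1.5, Pik3r5 = 0.6, Pik3cg = 0.4, Pik3cb = 0.2)

# Illustrative helper (not part of pathview): the value with the largest
# magnitude, keeping its sign.
max_abs <- function(x) unname(x[which.max(abs(x))])

# One summary value per method; each would color the single "PI3K" node.
node_value <- c(
  sum     = sum(pi3k_logfc),
  mean    = mean(pi3k_logfc),
  median  = median(pi3k_logfc),
  max     = max(pi3k_logfc),
  max.abs = max_abs(pi3k_logfc)
)

print(round(node_value, 4))
```

With mixed-sign member genes the choice of method matters: here the sum and mean are positive while the largest-magnitude value is negative, so the same node could be drawn red or green depending on the method. The method names used above ("sum", "mean", "median", "max", "max.abs") correspond to options of the `node.sum` argument, so something like `node.sum = "mean"` can be added to the `pathview()` call from the question.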