Question

Gene Symbol with Pathview

abhinavbn • 0

@abhinavbn-9000

Last seen 9.1 years ago

United States

I applied “Pathview” to overlay my gene expression data on a KEGG pathway (hsa05200). This pathway contains a large number of genes, and it seems that one gene symbol on the graph can represent several genes. For instance, if I move the cursor over “HDAC” on the graph, two gene symbols pop up (e.g. HDAC1, HDAC2). My expression values for these genes are HDAC1 = -2.5 and HDAC2 = 3.2, and when I apply pathview I see HDAC1 in the pathway but not HDAC2, even though HDAC2 has the larger value. Is this a bug? How can I display the gene with the highest value when a node contains multiple genes?

pathview bioconductor kegg visualization • 3.0k views

updated 9.1 years ago by Luo Weijun ★ 1.6k • written 9.1 years ago by abhinavbn • 0

score 0 · Answer 1 · 2015-10-16

Luo Weijun ★ 1.6k

@luo-weijun-1783

Last seen 17 months ago

United States

This is not a bug. From page 10 of the pathview tutorial:

Note that in the native KEGG view, a gene node may represent multiple genes/proteins with similar or redundant functional roles. The number of member genes ranges from 1 up to several tens. They are intentionally put together as a single node on pathway graphs for better clarity and readability. Therefore, we do not split nodes and mark each member gene separately by default. Rather, we visualize the node-wise data by summarizing the gene-wise data; users may specify the summarization method using the node.sum argument. Potential options include "sum", "mean", "median", "max", "max.abs" and "random".

The default is node.sum="sum", and you can use "max" in your case.
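To make the effect of node.sum concrete, here is a minimal sketch using the two HDAC values from the question. The base R part only illustrates the arithmetic for a node holding both genes and is not pathview's internal code; "random" is left out because it is not deterministic. The wrapper name `run_pathview_max` is made up for this example, and the call assumes the data vector is named by gene symbol (hence `gene.idtype = "SYMBOL"`); with Entrez IDs that argument would be left at its default.

```r
# Expression values from the question, named by gene symbol.
gene.data <- c(HDAC1 = -2.5, HDAC2 = 3.2)

# What each summarization option would give for a node that contains
# both genes (plain base R, for illustration only).
node.summaries <- c(
  sum     = sum(gene.data),
  mean    = mean(gene.data),
  median  = median(gene.data),
  max     = max(gene.data),
  # value with the largest magnitude, keeping its sign (an assumption
  # about how "max.abs" behaves)
  max.abs = gene.data[[which.max(abs(gene.data))]]
)
print(node.summaries)

# The corresponding pathview call. It needs the pathview package and
# network access to KEGG, so it is wrapped in a function and not run here.
run_pathview_max <- function(gene.data) {
  pathview::pathview(
    gene.data   = gene.data,
    pathway.id  = "05200",
    species     = "hsa",
    gene.idtype = "SYMBOL",  # assumes names(gene.data) are gene symbols
    node.sum    = "max"      # color each node by its largest member value
  )
}
```

With the default "sum" the two values largely cancel, which is why "max" (or "max.abs") is the more useful choice for a node like this one.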

9.1 years ago • Luo Weijun ★ 1.6k

score 0 · Answer 2 · 2015-10-18

abhinavbn • 0

@abhinavbn-9000

Last seen 9.1 years ago

United States

Thank you for the quick reply!

Yes, I set node.sum to "max" in my case, but I still see the name HDAC1 in the KEGG graph box, which now carries HDAC2's max value. I understand that node.sum takes the "max" value when a node has multiple genes, but the name of the gene that has the max value is not printed.

9.1 years ago • abhinavbn • 0

score 0 · Answer 3 · 2015-10-19

Luo Weijun ★ 1.6k

@luo-weijun-1783

Last seen 17 months ago

United States

You are right. All nodes with multiple genes mapped are labeled with the most representative protein/gene name. We don’t use the name of the gene with the maximal expression level or change. This makes more sense for most summary methods other than "max", such as "sum", "mean", "median", etc.
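Since the node label does not follow the summarized value, one way to see which member gene supplies the maximum is to look it up outside pathview. The sketch below is only an illustration in base R: the `node.members` list is a hand-written, hypothetical grouping based on the two genes named in the question, not something returned by pathview.

```r
# Member genes of the "HDAC" node as described in the question
# (hypothetical hand-written grouping, not pathview output).
node.members <- list(HDAC = c("HDAC1", "HDAC2"))
gene.data <- c(HDAC1 = -2.5, HDAC2 = 3.2)

# For each node, report the member gene with the highest value.
top.gene <- vapply(node.members, function(genes) {
  vals <- gene.data[intersect(genes, names(gene.data))]
  if (length(vals) == 0) return(NA_character_)  # no member gene measured
  names(vals)[which.max(vals)]
}, character(1))
print(top.gene)
```

For the values in this thread the lookup points to HDAC2, matching the value that node.sum="max" places on the node labeled HDAC1.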