Gene Symbol with Pathview
abhinavbn • 0
@abhinavbn-9000
Last seen 8.5 years ago
United States

I used Pathview to overlay my gene expression data on a KEGG pathway (hsa05200). This pathway contains a large number of genes, and it seems that one gene symbol on the graph can represent several genes. For instance, if I move the cursor over "HDAC" on the graph, two gene symbols pop up (e.g. HDAC1, HDAC2). My expression values for these genes are HDAC1 = -2.5 and HDAC2 = 3.2. When I run Pathview, I see HDAC1 in the pathway but not HDAC2, even though HDAC2 has the larger value. Is this a bug? How can I display the gene with the highest value when a node contains multiple genes?

 

pathview bioconductor kegg visualization • 2.6k views
Luo Weijun ★ 1.6k
@luo-weijun-1783
Last seen 10 months ago
United States
This is not a bug. From page 10 of the pathview tutorial: Note that in the native KEGG view, a gene node may represent multiple genes/proteins with similar or redundant functional roles. The number of member genes ranges from 1 up to several tens. They are intentionally put together as a single node on pathway graphs for better clarity and readability. Therefore, we do not split nodes and mark each member gene separately by default. Rather, we visualize the node-wise data by summarizing the gene-wise data; users may specify the summarization method using the node.sum argument. Potential options include "sum", "mean", "median", "max", "max.abs" and "random". The default is node.sum="sum", and you can use "max" in your case.
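To make the effect of node.sum concrete, here is a minimal sketch using the two HDAC values from the question. The summary functions are plain-R stand-ins written for illustration, not pathview's internal code, and `run_pathview_max` is a hypothetical wrapper name. The wrapper assumes the expression vector is named by gene symbol (hence gene.idtype = "SYMBOL") and is only defined here, not called, because the real call needs the pathview package and downloads the pathway from KEGG.

```r
# Gene-level values from the question: one KEGG node ("HDAC"), two member genes.
hdac <- c(HDAC1 = -2.5, HDAC2 = 3.2)

# Illustrative stand-ins for several node.sum options (not pathview internals).
node_summaries <- list(
  sum     = sum,
  mean    = mean,
  median  = median,
  max     = max,
  max.abs = function(x) x[which.max(abs(x))]  # value with the largest magnitude
)

# One summarized value per option; this is what would color the node.
node_values <- sapply(node_summaries, function(f) unname(f(hdac)))
print(node_values)

# Hypothetical wrapper around the real call. Assumes gene_values is a numeric
# vector named by gene symbol. Defined only; calling it requires pathview
# and network access to KEGG.
run_pathview_max <- function(gene_values) {
  pathview::pathview(
    gene.data   = gene_values,
    pathway.id  = "05200",
    species     = "hsa",
    gene.idtype = "SYMBOL",
    node.sum    = "max",
    out.suffix  = "node_max"
  )
}
```

With the default "sum", the opposite-signed values largely cancel, which is why the node can look nearly unchanged; "max" colors the node by the HDAC2 value instead.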
abhinavbn • 0
@abhinavbn-9000
Last seen 8.5 years ago
United States

Thank you for the quick reply!

Yes, I set node.sum to "max", but I still see the HDAC1 name in the KEGG graph box, which now carries the HDAC2 (max) value. I understand that node.sum takes the "max" value when a node has multiple genes, but the name of the gene that has the max value is not printed.

Luo Weijun ★ 1.6k
@luo-weijun-1783
Last seen 10 months ago
United States
You are right. All nodes with multiple genes mapped are labeled with the most representative protein/gene name. We don't use the name of the gene with the maximal expression level or change. This makes more sense for most summary methods other than "max", such as "sum", "mean" and "median".
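Since the node label does not follow the summarized value, one way to see which member gene drives a "max" node is to look it up in your own data. The sketch below does that in plain R for the HDAC example. It also defines (without calling) a hypothetical wrapper, `run_pathview_expanded`, that uses pathview's expand.node option to draw multi-gene nodes as single-gene nodes. As far as I know, expand.node only takes effect in the Graphviz view (kegg.native = FALSE), so treat this as a suggestion to try rather than a guaranteed fix. The symbol-named input vector is an assumption.

```r
# Gene-level values from the question, named by gene symbol.
hdac <- c(HDAC1 = -2.5, HDAC2 = 3.2)

# Which member gene supplies the value shown when node.sum = "max"?
top_gene <- names(hdac)[which.max(hdac)]
cat("Gene with the maximal value in this node:", top_gene, "\n")

# Hypothetical wrapper: Graphviz view with multi-gene nodes expanded into
# single-gene nodes, so each member gene gets its own labeled node.
# Defined only; calling it requires pathview (plus Rgraphviz) and KEGG access.
run_pathview_expanded <- function(gene_values) {
  pathview::pathview(
    gene.data   = gene_values,
    pathway.id  = "05200",
    species     = "hsa",
    gene.idtype = "SYMBOL",
    kegg.native = FALSE,
    expand.node = TRUE,
    out.suffix  = "expanded"
  )
}
```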

Thank you for the explanation!!
