How many of my genes from my gene list are in each KEGG pathway?
@41ccc2f8
United States

Hi! I have a gene list ranked by log2FoldChange that I use as input to gseKEGG, which returns a result object (or data frame) with information on the enriched pathways. However, I would like to know how many of the genes from my list are in each enriched pathway. How can I do this? Thank you!

Tags: Pathways, clusterProfiler, gseKEGG, KEGG
Guido Hooiveld ★ 4.1k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …

You can compare the number in the column setSize with the number of genes in core_enrichment. The former corresponds to the number of genes in a gene set (pathway); as far as I know, only the pathway genes that are also present in your ranked input list are counted. The latter corresponds to the number of core enrichment genes, a.k.a. leading edge genes. These are the genes that contribute most to the enrichment of the gene set. "The leading edge subset of a gene set is the subset of members that contribute most to the ES. For a positive ES (such as the one shown here), the leading edge subset is the set of members that appear in the ranked list prior to the peak score. For a negative ES, it is the set of members that appear subsequent to the peak score." (From: https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html)
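The setSize versus core_enrichment comparison can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes the gseKEGG result was exported from R to a tab-separated table (for example via as.data.frame() and write.table()), that it keeps the ID, setSize and core_enrichment columns, and that core_enrichment holds gene IDs separated by "/". The pathway and gene IDs below are made-up toy values, and count_core_genes is a hypothetical helper, not part of clusterProfiler.

```python
import csv
import io

# Toy stand-in for an exported gseKEGG result table.
# All IDs and numbers are invented for illustration only.
TOY_TSV = "\n".join([
    "ID\tsetSize\tcore_enrichment",
    "pathA\t5\tg1/g2/g3",
    "pathB\t4\tg2/g7",
]) + "\n"


def count_core_genes(rows):
    """Return, per pathway, setSize and the number of leading edge genes.

    Assumes each row has 'ID', 'setSize' and a '/'-separated
    'core_enrichment' field.
    """
    summary = []
    for row in rows:
        # Split the '/'-separated gene IDs and drop empty fragments.
        core = [g for g in row["core_enrichment"].split("/") if g]
        summary.append({
            "ID": row["ID"],
            "setSize": int(row["setSize"]),
            "n_core": len(core),
            "core_genes": core,
        })
    return summary


rows = list(csv.DictReader(io.StringIO(TOY_TSV), delimiter="\t"))
summary = count_core_genes(rows)

for s in summary:
    print(f'{s["ID"]}: {s["n_core"]} of {s["setSize"]} genes are in the leading edge')
```

To use this on real output, replace io.StringIO(TOY_TSV) with an open file handle to the exported table. The same split-and-count step can of course be done directly in R on the result data frame.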

Also this info may be helpful: https://yulab-smu.top/biomedical-knowledge-mining-book/faq.html#how-to-extract-genes-of-a-specific-termpathway and https://github.com/YuLab-SMU/clusterProfiler/issues/103#issuecomment-338035194.


Awesome, thank you, that's really helpful!


Thanks for sharing!
