Include or exclude N/A genes (those without any KEGG annotation) when running Wilcox test?
mch • 0

Hello there,

I am running a KEGG enrichment analysis but about 2500/14700 genes do not get converted to KEGG identifiers using keggConv. Presumably this is because they are not in the KEGG database.

Should we exclude those 2500 NA genes from the Wilcox test, since they would always be counted as "not in pathway" when their p-values are compared against those of genes that are "in pathway"? In an extreme case, if the NA genes were heavily skewed toward being significantly DE, they could dilute the signal from DE genes that are actually in pathways, potentially preventing those pathways from being significantly enriched. This makes me think those NA genes should be excluded...

Alternatively, we could consider those NA genes an essential part of the "baseline" transcriptome, of which the KEGG pathways and their genes are also a component, so the NA genes are still needed to test for pathway enrichment. In that case, all genes, including those not in the KEGG database, should be included in the Wilcox test...
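
To make the two options concrete, here is a minimal toy sketch in plain Python (not my actual R/KEGGREST pipeline). It computes a normal-approximation z-score for a Wilcoxon rank-sum comparison of pathway p-values against the "not in pathway" background, once with the NA genes excluded and once with them included. The function name `rank_sum_z` and all p-values are made up for illustration, and the unannotated genes are deliberately given small p-values to mimic the extreme case described above.

```python
import math

def rank_sum_z(in_set, background):
    """Normal-approximation z-score for a Wilcoxon rank-sum (Mann-Whitney U) test.

    A negative z means values in `in_set` tend to be smaller (here: smaller
    p-values) than values in `background`. Ties receive average ranks; the
    variance is not tie-corrected, which is fine for this tie-free toy data.
    """
    pooled = [(v, 0) for v in in_set] + [(v, 1) for v in background]
    pooled.sort(key=lambda t: t[0])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(in_set), len(background)
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u1 - mean_u) / sd_u

# Hypothetical DE p-values, invented purely for illustration.
pathway = [0.001, 0.01, 0.02, 0.03, 0.2]             # genes in the pathway
other_annotated = [0.05, 0.15, 0.25, 0.35, 0.45,
                   0.55, 0.65, 0.75, 0.85, 0.95]     # KEGG genes outside the pathway
unannotated = [0.002, 0.004, 0.006, 0.008,
               0.012, 0.015]                         # NA genes, assumed strongly DE

z_excl = rank_sum_z(pathway, other_annotated)                # NA genes dropped
z_incl = rank_sum_z(pathway, other_annotated + unannotated)  # NA genes kept

print(f"z with NA genes excluded: {z_excl:.2f}")
print(f"z with NA genes included: {z_incl:.2f}")
```

With this made-up data the pathway looks less enriched (z closer to zero) when the NA genes are kept in the background, which is the dilution effect I am worried about. If the NA genes were mostly non-significant instead, the shift would go the other way, so the choice of background really does matter.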

I was wondering if there is a standard or suggested "protocol" for handling this issue (genes that have no KEGG identifier)? Any insight would be greatly appreciated!

Thank you!

KEGGREST KEGG RNASeq • 580 views