clusterProfiler - difference in the number of genes from input and enrichGO results
@jane-merlevede-5019
Last seen 6.1 years ago

Hello,

Why, in the results of enrichGO(), is the total number of genes in the GeneRatio column different from the number of genes in my input?

 

In one analysis, my input contains 858 Entrez gene IDs. I run enrichGO() as follows:

GOenrichment_BP = enrichGO(gene_id$ENTREZID, OrgDb=org.Hs.eg.db, keytype = "ENTREZID", ont = "BP", pvalueCutoff = 0.001, pAdjustMethod = "BH", qvalueCutoff = 0.001, minGSSize = 5, maxGSSize = 900, readable = TRUE)

The first two lines of the result are:

ID    Description    GeneRatio    BgRatio
GO:0070482    response to oxygen levels    46/740    301/16672
GO:0036293    response to decreased oxygen levels    44/740    285/16672

Why are there now 740 genes?

Maybe I am missing something obvious here. It would be nice to have some explanation.

Thank you

clusterProfiler • 3.2k views
Guangchuang Yu ★ 1.2k
@guangchuang-yu-5419
Last seen 22 days ago
China/Guangzhou/Southern Medical Univer…

There are two cases in which input genes are dropped:

1. If your input gene IDs contain duplicates, the duplicated IDs are removed.

2. Genes that do not have any GO annotation are removed.
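To make the two cases concrete, here is a minimal base-R sketch with toy data. The gene IDs, the term names, and the `annotation` table are all hypothetical and only stand in for the real GO mapping; this is not how enrichGO() is implemented internally, just an illustration of how the GeneRatio denominator can end up smaller than the input length.

```r
# Toy input (hypothetical IDs, not from the original post); "2" appears twice.
input_genes <- c("1", "2", "2", "3", "4", "5")

# Hypothetical gene-to-term table for one ontology.
# Genes "4" and "5" have no entry, i.e. no annotation.
annotation <- data.frame(
  gene = c("1", "1", "2", "3"),
  term = c("GO:A", "GO:B", "GO:A", "GO:B"),
  stringsAsFactors = FALSE
)

# Case 1: duplicated IDs are collapsed.
unique_genes <- unique(input_genes)

# Case 2: genes without any annotation are dropped.
annotated_genes <- intersect(unique_genes, annotation$gene)
dropped_genes <- setdiff(unique_genes, annotation$gene)

# The GeneRatio denominator corresponds to the genes that remain.
gene_ratio_denominator <- length(annotated_genes)

# Numerator for one term: remaining input genes annotated to that term.
genes_in_term <- intersect(
  annotated_genes,
  annotation$gene[annotation$term == "GO:A"]
)
gene_ratio <- paste0(length(genes_in_term), "/", gene_ratio_denominator)
print(gene_ratio)
```

On real data, a similar check could probably be done by looking up the input IDs with AnnotationDbi::select() on org.Hs.eg.db and listing the IDs that come back without a term in the chosen ontology.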

 

 


In this analysis, I have no duplicated IDs. So about 14% of my Entrez gene IDs (118 of 858) do not have a GO annotation.

Is that a typical percentage? What do you usually get?
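For reference, the 14% figure follows directly from the two counts quoted in this thread (858 input IDs, 740 in the GeneRatio denominator). A small sketch of the arithmetic, with variable names chosen only for illustration:

```r
# Counts quoted in the thread.
n_input <- 858   # Entrez IDs supplied to enrichGO()
n_kept  <- 740   # denominator shown in GeneRatio

# Genes dropped before the enrichment test.
n_dropped <- n_input - n_kept

# Share of the input that was dropped, in percent.
pct_dropped <- round(100 * n_dropped / n_input, 1)

cat(n_dropped, "genes dropped,", pct_dropped, "% of the input\n")
```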
