When I ran GO and KEGG enrichment, I noticed something strange. When I used all of the differentially expressed genes as input, only 6.99% of the input gene IDs failed to map. But when I split them into upregulated and downregulated genes and ran the enrichment on each set separately, 8.03% of the input gene IDs failed to map for the downregulated genes and 8.09% for the upregulated genes. Why did the gene ID mapping rate decrease when I ran the two sets separately? The enrichment results also differ between the full gene list and the upregulated/downregulated lists. My code is below (shown here for the upregulated genes).
```r
library(openxlsx)
library(ggplot2)
library(stringr)
library(enrichplot)
library(clusterProfiler)
library(GOplot)
library(DOSE)
library(ggnewscale)
library(topGO)
library(circlize)
library(ComplexHeatmap)
library(org.Hs.eg.db)

# Input: one list of differentially expressed genes (here, the upregulated set)
info <- read.csv("Up_genes.csv")

GO_database <- 'org.Hs.eg.db'
KEGG_database <- 'hsa'

# Convert gene symbols to Entrez IDs
gene <- bitr(info$gene_id, fromType = 'SYMBOL', toType = 'ENTREZID', OrgDb = GO_database)

# GO enrichment across all three ontologies (BP, CC, MF)
GO <- enrichGO(gene$ENTREZID, OrgDb = GO_database, keyType = "ENTREZID", ont = "ALL",
               pvalueCutoff = 0.05, qvalueCutoff = 0.05, readable = TRUE)

# KEGG pathway enrichment
KEGG <- enrichKEGG(gene$ENTREZID, organism = KEGG_database,
                   pvalueCutoff = 0.05, qvalueCutoff = 0.05)

# Bar plots
barplot(KEGG, showCategory = 30, title = 'KEGG Pathway')
barplot(GO, split = "ONTOLOGY") + facet_grid(ONTOLOGY ~ ., scales = "free")
barplot(KEGG, showCategory = 30, title = 'KEGG Pathway', font.size = 8.0, label_format = 45)

# Dot plots
dotplot(GO, split = "ONTOLOGY") + facet_grid(ONTOLOGY ~ ., scales = "free")
dotplot(KEGG, showCategory = 30, font.size = 8.0, label_format = 45)
```
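To narrow this down, a comparison along these lines might show which symbols fail to map in each list. This is only a sketch: `unmapped_report` is a hypothetical helper (not a clusterProfiler function), the toy vectors stand in for my real gene lists, and counting over unique IDs is an assumption that may not match how `bitr` computes the percentage in its warning.

```r
# Hypothetical helper (not part of clusterProfiler): summarise which input
# IDs are absent from the IDs returned by a mapping step such as bitr().
unmapped_report <- function(input_ids, mapped_ids) {
  # Assumption: count over unique IDs, so duplicated symbols are counted once
  input_ids <- unique(as.character(input_ids))
  missing <- setdiff(input_ids, unique(as.character(mapped_ids)))
  list(
    n_input      = length(input_ids),
    n_unmapped   = length(missing),
    pct_unmapped = round(100 * length(missing) / length(input_ids), 2),
    unmapped     = missing
  )
}

# Toy data standing in for a real symbol list and its mapped subset
up_in     <- c("TP53", "BRCA1", "FAKE1", "MYC")
up_mapped <- c("TP53", "BRCA1", "MYC")

rep_up <- unmapped_report(up_in, up_mapped)
rep_up$pct_unmapped  # percentage of unique input IDs with no mapping
rep_up$unmapped      # the IDs that did not map
```

With the real data, I would expect the call to look like `unmapped_report(info$gene_id, gene$SYMBOL)`, since `bitr` names its output columns after `fromType` and `toType`. Running it on the full list and on the up/down lists would show whether the same symbols are unmapped in each case and whether the lists contain duplicates.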
Many thanks in advance!