Hello! I have a list of about 6400 differentially expressed genes and I want to perform KEGG pathway enrichment analysis using the clusterProfiler package.
I first did it with the ORA method, using the enrichKEGG function and running it separately on the genes with positive and negative logFC. I got hundreds of enriched pathways with this method, for both the up- and downregulated genes.
I then tried to do it with the GSEA method using the gseKEGG function (in this case, of course, I didn't split the + and - logFC values, since the method already assigns an enrichment score with a sign). However, I only got 13 pathways as a result, half upregulated and half downregulated. I find it odd considering the number of enriched pathways I had obtained with the other method. I am aware that the GSEA method usually returns fewer results than the ORA one, but I had also analyzed my gene list using the GSEA desktop software and had gotten about 20 upregulated and 20 downregulated pathways.
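To make the two setups concrete, here is a minimal sketch of how the inputs for each method are typically prepared. The data frame `deg`, its columns `ENTREZID` and `logFC`, and the values in it are placeholders for illustration, not the actual data. The clusterProfiler calls are left as comments because they need the package and a connection to KEGG.

```r
# Toy differential expression table (hypothetical names and values).
deg <- data.frame(
  ENTREZID = c("7157", "1956", "672", "5290"),
  logFC    = c(1.8, -2.3, 0.9, -0.4),
  stringsAsFactors = FALSE
)

# ORA input: two separate ID vectors, split by the sign of logFC.
up_genes   <- deg$ENTREZID[deg$logFC > 0]
down_genes <- deg$ENTREZID[deg$logFC < 0]
# ora_up   <- clusterProfiler::enrichKEGG(gene = up_genes,   organism = "hsa")
# ora_down <- clusterProfiler::enrichKEGG(gene = down_genes, organism = "hsa")

# GSEA input: one named numeric vector (names = Entrez IDs),
# sorted in decreasing order, as gseKEGG expects.
geneList <- sort(setNames(deg$logFC, deg$ENTREZID), decreasing = TRUE)
# gsea_res <- clusterProfiler::gseKEGG(geneList = geneList, organism = "hsa")
```

Note that the ranked vector here contains only the differentially expressed genes, which is what the question describes. The GSEA method is usually run on the full ranked list of measured genes, so what goes into `geneList` is worth checking when results differ between tools.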
Why could it be that I'm getting so few enriched pathways using the gseKEGG function? Should I just set a less stringent p-value cutoff, or is there something else I could try? I'm just using the default parameters, but I'll leave my code here in case you need to see it.
pathways_GSEA <- gseKEGG(geneList = geneList,
                         organism = 'hsa',
                         pAdjustMethod = "BH",
                         minGSSize = 15,
                         maxGSSize = 500,
                         nPermSimple = 10000,
                         pvalueCutoff = 0.05)

sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
LC_COLLATE=Spanish_Argentina.1252  LC_CTYPE=Spanish_Argentina.1252
LC_MONETARY=Spanish_Argentina.1252 LC_NUMERIC=C
LC_TIME=Spanish_Argentina.1252