Question: Accounting for Gene Set Overlaps in Goana and Kegga Results
Is there a technique to account for the correlation between gene set results that arises when gene sets share many of the same genes? For example, if the top results are

##                                          Pathway   N  Up Down                P.Up P.Down
## path:hsa05169       Epstein-Barr virus infection 178  86   10 0.00000000000000084  1.00
## path:hsa05165     Human papillomavirus infection 289 114   34     0.0000000000012  1.00


then both pathways probably rank highly because hsa05169 and hsa05165 share many of the same genes.

Answer: Accounting for Gene Set Overlaps in Goana and Kegga Results
If you are using Gene Ontology, you could summarise your findings with a tool such as REVIGO. It tries to pick representative terms based on semantic similarity (i.e. how closely related the terms are within the GO hierarchy, not how similar their names are). Alternatively, you could use emapplot in clusterProfiler to visualise the overlapping gene sets.
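
To make "overlapping" concrete, here is a minimal sketch in Python that scores each pair of gene sets by its Jaccard index (shared genes divided by genes in either set). The set names and gene IDs are made up for illustration and are not real KEGG memberships. I believe emapplot draws its edges from a similar overlap measure, but treat that as an assumption rather than a description of its internals.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index: shared genes divided by genes in either set."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy gene sets with hypothetical gene IDs (NOT real pathway memberships).
gene_sets = {
    "setA": {"g1", "g2", "g3", "g4"},
    "setB": {"g3", "g4", "g5", "g6"},
    "setC": {"g7", "g8"},
}

# Score every unordered pair of gene sets.
overlaps = {
    (x, y): jaccard(gene_sets[x], gene_sets[y])
    for x, y in combinations(sorted(gene_sets), 2)
}

for pair, score in overlaps.items():
    print(pair, round(score, 3))
```

A high score between two significant pathways suggests that their p-values are largely driven by the same genes. This only describes the overlap; it does not change the test itself.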

Thanks for the suggestion. REVIGO works only for Gene Ontology, so it isn't suited to KEGG pathways. emapplot is a good suggestion for visualising the problem, but it doesn't adjust the statistical modelling to correct for it.

Oh, now I understand. Try PADOG instead! The name stands for Pathway Analysis with Down-weighting of Overlapping Genes, so it might be the one that you're looking for. It is available as a Bioconductor package.
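
To give a feel for the down-weighting idea, here is a rough sketch in Python of a per-gene weight, as I understand the PADOG approach: genes that appear in many gene sets get a weight near 1, and genes specific to few sets get a weight near 2. The formula, the function name `padog_style_weights`, and the toy gene sets are my own illustration and should be checked against the PADOG documentation. This is not the package's code, and it leaves out the rest of the method (the weighted gene scores and the permutation test).

```python
from collections import Counter
from math import sqrt


def padog_style_weights(gene_sets):
    """Weight each gene by how specific it is across the gene sets.

    Assumed form: w(g) = 1 + sqrt((max_f - f(g)) / (max_f - min_f)),
    where f(g) is the number of gene sets containing gene g.
    """
    counts = Counter(g for genes in gene_sets.values() for g in set(genes))
    lo, hi = min(counts.values()), max(counts.values())
    if hi == lo:
        # Assumption for this sketch: if every gene is equally frequent,
        # give all genes the same weight.
        return {g: 1.0 for g in counts}
    return {g: 1.0 + sqrt((hi - f) / (hi - lo)) for g, f in counts.items()}


# Toy gene sets with hypothetical gene IDs (NOT real pathway memberships).
gene_sets = {
    "setA": {"g1", "g2", "g3"},
    "setB": {"g2", "g3", "g4"},
    "setC": {"g3", "g5"},
}

weights = padog_style_weights(gene_sets)
for gene in sorted(weights):
    print(gene, round(weights[gene], 3))
```

Here g3 sits in all three sets and gets the lowest weight, while g1, g4 and g5 each sit in one set and get the highest. A gene shared by two large pathways therefore contributes less to each pathway's score than a gene unique to one of them.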