Implementation of the "original" GSEA algorithm in R
nhaus

Hi,

I often use the GSEA function of the clusterProfiler package downstream of my differential gene expression analysis. If I understand it correctly, clusterProfiler::GSEA uses fgsea under the hood to estimate significance levels, and to do so it permutes the gene labels.

However, if I remember correctly, the "original" GSEA implementation (i.e. the one from Subramanian et al.) actually permutes the class labels to preserve gene-gene correlations.
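
To make the distinction concrete, here is a minimal Python sketch (standard library only) of the two permutation schemes on toy data. Everything in it is an illustrative assumption: the function names, the difference-of-means ranking metric, the unweighted running-sum statistic, and the two-sided p-value are simplifications, not the fgsea, clusterProfiler, or Broad implementations.

```python
import random
from statistics import mean


def rank_metric(expr, labels):
    """Per-gene difference in class means (class 1 minus class 0)."""
    scores = {}
    for gene, values in expr.items():
        a = [v for v, lab in zip(values, labels) if lab == 1]
        b = [v for v, lab in zip(values, labels) if lab == 0]
        scores[gene] = mean(a) - mean(b)
    return scores


def enrichment_score(scores, gene_set):
    """Unweighted running-sum statistic; returns the largest deviation from 0."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_hit = sum(1 for g in ranked if g in gene_set)
    n_miss = len(ranked) - n_hit
    running, best = 0.0, 0.0
    for g in ranked:
        running += 1.0 / n_hit if g in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def gene_permutation_p(scores, gene_set, n_perm, rng):
    """Null from random gene sets of the same size (ignores gene-gene correlation)."""
    observed = abs(enrichment_score(scores, gene_set))
    genes = list(scores)
    hits = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(genes, len(gene_set)))
        if abs(enrichment_score(scores, random_set)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def label_permutation_p(expr, labels, gene_set, n_perm, rng):
    """Null from shuffled class labels (each gene keeps its expression profile)."""
    observed = abs(enrichment_score(rank_metric(expr, labels), gene_set))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_es = enrichment_score(rank_metric(expr, shuffled), gene_set)
        if abs(null_es) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 40 genes, 10 samples, no true class effect.
# The 8 genes in the set share a per-sample factor, so they are correlated.
rng = random.Random(0)
labels = [0] * 5 + [1] * 5
gene_set = {f"g{i}" for i in range(8)}
factor = [rng.gauss(0, 1) for _ in labels]
expr = {}
for i in range(40):
    name = f"g{i}"
    if name in gene_set:
        expr[name] = [f + rng.gauss(0, 0.3) for f in factor]
    else:
        expr[name] = [rng.gauss(0, 1) for _ in labels]

p_gene = gene_permutation_p(rank_metric(expr, labels), gene_set, 200, rng)
p_label = label_permutation_p(expr, labels, gene_set, 200, rng)
print("gene permutation p:", p_gene)
print("label permutation p:", p_label)
```

In the gene-permutation null, the correlated genes of the set are replaced by randomly drawn genes, whereas the label-permutation null keeps every gene's expression profile intact and only breaks its link to the classes.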

So I was wondering whether there is an implementation of the "original" GSEA algorithm that can be called from R. I think the Python package gseapy can do it, for example.

Any pointers are much appreciated!

Cheers!

clusterProfiler GSEA fgsea

I have never used these implementations myself, but I know the Broad Institute has released one:

https://github.com/GSEA-MSigDB/GSEA_R

The Biometrics Research Branch at the National Cancer Institute has also released one:

https://brb.nci.nih.gov/BRB-ArrayTools/ArrayToolsRPackages.html (bottom of the page)


Thanks for the links, but note that

1. The Broad Institute did release R scripts for GSEA, but that was nearly 20 years ago. The scripts haven't been updated since 2005 and are not maintained or supported, see https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/R-GSEA_Readme . I have tried the 2005 R-GSEA scripts but found them so slow and memory-hungry as to be essentially unusable. Most importantly, the 2005 Broad Institute R scripts are copyrighted in a way that prevents Bioconductor package authors from copying or adapting the R-GSEA code into a new package.

2. The https://github.com/GSEA-MSigDB/GSEA_R documentation says that it "remains unsupported by the GSEA-MSigDB Team".

3. BRB-ArrayTools does not make any claim, as far as I can see, that their GSEA tool is equivalent to that published by the Broad Institute. They don't cite any of the Broad Institute papers, which would suggest that it is not exactly equivalent. The BRB-ArrayTools manual recommends the use of the GSA package by Efron and Tibshirani as an improvement on the Broad Institute method.


I just found out about the romer function from the limma package. Its documentation says that it tests a hypothesis similar to that of Gene Set Enrichment Analysis (GSEA) (Subramanian et al, 2005), so I assume that it also controls for inter-gene correlation. Am I right to assume that this is a reasonable alternative to the GSA package?


Yes, but I use and recommend camera() or cameraPR() instead of romer().

camera() and romer() both adjust for inter-gene correlations. But I prefer the purely competitive approach of camera() over the (difficult to interpret) combination of competitive and self-contained hypotheses that is tested by GSEA or romer().

assaron

There is a label-permutation version in fgsea; the function is called fgseaLabel. The nominal p-values there are identical to Broad's version when correlation is used for gene ranking. However, we never worked out how to do a multiple hypothesis correction: BH-adjusted p-values are never significant, and Broad's GSEA has a procedure that approximates BH correction, so it's also not a great choice.
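
For reference, the Benjamini-Hochberg (BH) adjustment itself can be sketched in a few lines of Python. The function name `bh_adjust` and the numbers in the demo (1000 permutations, 2000 gene sets, 20 sets at the smallest attainable p-value) are hypothetical. The demo illustrates one possible contributor to the lack of significant results, which is an assumption here and not something established in this thread: an empirical p-value from n permutations cannot be smaller than 1/(n+1), so many sets must reach that floor before any of them survives the adjustment.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical scenario: 1000 label permutations, 2000 gene sets tested,
# 20 sets sit at the smallest attainable p-value, the rest are clearly null.
n_perm = 1000
floor = 1 / (n_perm + 1)
pvals = [floor] * 20 + [0.5] * 1980
adj = bh_adjust(pvals)
print("smallest nominal p:", floor)
print("its BH-adjusted value:", adj[0])
```

In this toy setting the 20 best sets all receive the same adjusted value, `floor * 2000 / 20`, which stays above 0.05 even though their nominal p-values are as small as the permutation scheme allows.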


Also, it could be worth taking a look at the camera method from the limma package; it may be more appropriate in your case.