FGSEA pre-ranked input: increasing or decreasing order?
sropri • 0
@af8dfadb

Hi,

I have a question about how the pre-ranked list of gene values should be supplied. For example, I take my differential expression results between treatment and control and create a new rank statistic: the -log10 of the FDR value multiplied by the sign of the log fold change. I then sort the vector in increasing order, so the downregulated genes are at the top and the upregulated genes are at the bottom. This is shown in the code below:


newRank_pvalueAndFC = -log10(myDEresults$FDR) * sign(myDEresults$logFC)
names(newRank_pvalueAndFC) = rownames(myDEresults)
newRank_pvalueAndFC = newRank_pvalueAndFC[order(newRank_pvalueAndFC)]

In other words, the downregulated genes with a really small FDR value are at the top and the upregulated genes with a really small FDR value are at the bottom. An example of my new rank vector looks like this:

Gene1   -5.2
Gene2   -3.1
Gene3   -0.1
Gene4    1.2
Gene5    3.2

My question is: does the ordering of the pre-ranked list matter? That is, given the way NES is calculated, do the upregulated genes have to be at the top and the downregulated genes at the bottom, so that a positive NES for a pathway means its genes are active or upregulated in treatment vs. control? Or can the downregulated genes be at the top, as I have them, with GSEA taking the sign of the values into account, so that the NES is still negative for a pathway whose genes are down in treatment vs. control even though they sit at the top of the list? Also, does fgsea re-rank the pre-ranked vector I supply? If it does, what is the benefit of supplying a pre-ranked vector?

I appreciate any help with this, as I am new to GSEA and to using it for enrichment analysis.

fgsea • 1.3k views
alserg ▴ 280
@assaron

fgsea orders the genes automatically, so there is no need to sort them manually.
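
To illustrate why manual sorting makes no difference, here is a minimal base R sketch using the toy values from the question. It only mimics a "sort in decreasing order" step and is not fgsea's actual code; the variable names are hypothetical.

```r
# Toy ranking statistic from the question (gene names are placeholders)
stats <- c(Gene1 = -5.2, Gene2 = -3.1, Gene3 = -0.1, Gene4 = 1.2, Gene5 = 3.2)

# The same vector supplied in increasing, decreasing, and shuffled order
increasing <- stats[order(stats)]
decreasing <- stats[order(stats, decreasing = TRUE)]
shuffled   <- stats[c(3, 5, 1, 4, 2)]

# Re-sorting in decreasing order yields one and the same ranked list
ranked <- sort(increasing, decreasing = TRUE)
same_from_decreasing <- identical(ranked, sort(decreasing, decreasing = TRUE))
same_from_shuffled   <- identical(ranked, sort(shuffled, decreasing = TRUE))

print(names(ranked))
print(c(same_from_decreasing, same_from_shuffled))
```

This holds as long as the values are distinct. With tied values, the relative order of the tied genes may depend on the order in which they were supplied.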

Thank you for your reply. I know you mentioned this in another, similar post. How are the genes in my example above ordered? Are they simply reordered in decreasing order? And does fgsea take the sign into account when calculating NES, or do genes at the top of the list give a positive NES whether they are downregulated or upregulated in the DE analysis? I appreciate your help with this.

They are reordered in decreasing order.

Thank you!

I have a follow-up question: does fgsea treat positive and negative numbers differently when calculating the enrichment score, or are the numbers simply used for sorting?

The numbers are only used for sorting, that is, to create a ranked list, and are then 'ignored'. So it is your second statement.

For background concepts see e.g. https://www.pathwaycommons.org/guide/primers/data_analysis/gsea/

Guido, that's not really true. The numbers are not ignored after ranking: their absolute values are used in calculating the statistic. Otherwise it would be just a Kolmogorov-Smirnov test. The values are only ignored when the gseaParam=0 option is set, since raising them to the power of 0 makes them all 1, which does turn it into a Kolmogorov-Smirnov test.

The numbers are first used for ranking, then their absolute values are used in calculating the GSEA statistic.
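
The two roles of the numbers (ranking first, then weighting by absolute value) can be sketched in base R. This is an illustrative implementation of the classic GSEA running-sum enrichment score as I understand it, not fgsea's own code. The function `enrichment_score`, its `gsea_param` argument, and the toy gene names are all hypothetical, and no normalization (NES) or p-value is computed.

```r
# Hypothetical helper: classic running-sum enrichment score (not fgsea's code)
enrichment_score <- function(stats, gene_set, gsea_param = 1) {
  stats   <- sort(stats, decreasing = TRUE)     # step 1: rank, largest first
  in_set  <- names(stats) %in% gene_set
  weight  <- abs(stats)^gsea_param              # step 2: sign dropped, magnitude kept
  hit     <- cumsum(ifelse(in_set, weight, 0)) / sum(weight[in_set])
  miss    <- cumsum(!in_set) / sum(!in_set)
  running <- hit - miss
  unname(running[which.max(abs(running))])      # signed maximum deviation from zero
}

# Toy ranking statistic and gene sets (made up for illustration)
stats <- c(g1 = 5.2, g2 = 3.2, g3 = 2.0, g4 = 1.2, g5 = 0.4,
           g6 = -0.1, g7 = -0.9, g8 = -2.2, g9 = -3.1, g10 = -5.2)
up_set   <- c("g1", "g2", "g4")     # genes near the top of the decreasing list
down_set <- c("g7", "g9", "g10")    # genes near the bottom

es_up   <- enrichment_score(stats, up_set)      # positive
es_down <- enrichment_score(stats, down_set)    # negative

# Same ranking, different magnitude for g1
stats2 <- stats
stats2["g1"] <- 50
es_up_weighted  <- enrichment_score(stats2, up_set, gsea_param = 1)
es_up_ks_before <- enrichment_score(stats,  up_set, gsea_param = 0)
es_up_ks_after  <- enrichment_score(stats2, up_set, gsea_param = 0)

print(c(es_up = es_up, es_down = es_down))
print(c(weighted = es_up_weighted, ks_before = es_up_ks_before, ks_after = es_up_ks_after))
```

In this sketch the sign of the score comes only from where the pathway genes fall in the decreasing list: a set concentrated at the top (upregulated) scores positive, and one concentrated at the bottom (downregulated) scores negative. Changing the magnitude of g1 without changing the ranking alters the score when `gsea_param = 1` but not when `gsea_param = 0`, which is the Kolmogorov-Smirnov-like case described above.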
