Question: ReactomePA / Reactome: GSEA error message
12 months ago
martin.busch wrote:

Hi everyone,

I am sorry to ask another question, but there is an error message that keeps puzzling me. When passing a list of human Entrez IDs to ReactomePA for GSEA using

result <- gsePathway(anaData, nPerm=10000, pvalueCutoff=0.2, pAdjustMethod="BH", verbose=FALSE)

RStudio becomes busy and cannot finish the computation. When I manually stop it, I get the following message:

Warning message:
In fgsea(pathways = geneSets, stats = geneList, nperm = nPerm, minSize = minGSSize,  :
  There are duplicate gene names, fgsea may produce unexpected results

How can I pass parameters such as maxSize=500, and which parameter can I use to avoid duplicate gene names, given that the Entrez IDs are unique? It seems the mapping yields duplicate gene names?!

Thank you so much in advance for your help,



P.S.: The input looks like this:

> head(anaData,10)
    1301     3371     4069    57537    11081     5764   114899     2331     1303     7060 
6.198340 4.505550 3.962765 3.753962 3.461323 3.148910 3.075820 3.034261 3.010098 2.880258 
> length(anaData)
[1] 11317
modified 12 months ago • written 12 months ago by martin.busch

Could you also paste the result of any(duplicated(names(anaData)))? This is what fgsea checks.

assaron wrote:

Thank you so much for your comment - in fact I was pretty surprised to see that the result was TRUE, which should not have happened. There was some mapping involved, and it seems that multiple Ensembl IDs can map to one Entrez ID. I thought I had this sorted out. Anyway, now it works fine! Thanks a lot!

written 12 months ago by martin.busch
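
For readers hitting the same warning, here is a minimal base R sketch of the check suggested above and one way to collapse duplicated Entrez IDs before running GSEA. The toy vector ana_data and its values are made up for illustration. Keeping the entry with the largest absolute statistic per ID is only one possible policy and an assumption here, not necessarily what was done in this thread.

```r
# Toy stand-in for anaData: a named numeric vector whose names are Entrez IDs.
# The values and the duplicated IDs are hypothetical.
ana_data <- c("1301" = 6.2, "3371" = 4.5, "1301" = 3.9, "4069" = -1.2, "3371" = -4.8)

# The same check that triggers the fgsea warning.
has_dups <- any(duplicated(names(ana_data)))

# Assumed policy: for each ID, keep the entry with the largest absolute value.
# Sorting by |value| first means duplicated() drops the weaker entries.
ord <- order(abs(ana_data), decreasing = TRUE)
dedup <- ana_data[ord]
dedup <- dedup[!duplicated(names(dedup))]

# GSEA expects the ranked statistics in decreasing order.
dedup <- sort(dedup, decreasing = TRUE)
```

After this step any(duplicated(names(dedup))) is FALSE, so the vector can be passed on in place of the original one. Averaging the values per ID (for example with tapply) would be an equally reasonable alternative, depending on the analysis.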
