Question: Clusterprofiler - MSigDB gene set analysis - Updated
4 weeks ago by
thomasjenner333 wrote:

Hi,

I'm attempting to use the 'enricher' and 'GSEA' functions from the clusterProfiler package to analyze gene sets from MSigDB.

The following is the code I'm using:

> gmtfile <- "/path/c5.all.v6.1.entrez.gmt"
> c5 <- read.gmt(gmtfile)

> head(df)
   ENTREZID log2FoldChange
1 100516980     0.11587633
2 100155074     0.11587633

> egmt <- enricher(as.character(df[,1]), TERM2GENE=c5)
--> No gene can be mapped....
--> Expected input gene ID: 27433,10846,23479,3669,65977,10808
--> return NULL...

> head(geneList)
100154447    396596 100516171 100155895    397132 100515447
 6.035077  4.837211  4.629196  4.524015  4.420449  4.401480

> egmt2 <- GSEA(geneList, TERM2GENE=c5, verbose=FALSE)
--> Expected input gene ID: 54932,23001,3329,2035,9837,22894
Error in check_gene_id(geneList, geneSets) :
  --> No gene can be mapped....

ANSWER: I got a list of pathways by doing the following

eg = bitr(d$SYMBOL, fromType="SYMBOL", toType=c("PATH", "ENTREZID"), OrgDb="org.Ss.eg.db")
> head(eg)
    SYMBOL  PATH  ENTREZID
1    ACKR1 05144 100154447
2     FMO1 00982    397132

tt <- eg[,c(2,3)]
> head(tt)
    PATH  ENTREZID
1  05144 100154447
2  00982    397132

> egmt <- enricher(as.vector(df[,1]), pvalueCutoff=1, qvalueCutoff=1, pAdjustMethod = "BH", TERM2GENE=tt)
> head(egmt)
         ID Description GeneRatio BgRatio pvalue p.adjust qvalue
00010 00010       00010   32/2868 32/2868      1        1      1
00020 00020       00020   20/2868 20/2868      1        1      1

I'm not sure why the pathway descriptions aren't displayed. Any suggestions? Thanks
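One possible explanation, offered as a hedged guess rather than a confirmed diagnosis: the enricher() call above passes only TERM2GENE. enricher() also accepts an optional TERM2NAME table (term ID, then readable name), and when it is missing the Description column simply repeats the ID. The sketch below uses base R only. The term2name table is hypothetical and hand-built for illustration, and the two pathway names are the general KEGG names for those IDs as far as I know, so check them against KEGG before relying on them.

```r
# TERM2GENE table in the shape shown in the post: pathway ID, then gene ID.
tt <- data.frame(
  PATH = c("05144", "00982"),
  ENTREZID = c("100154447", "397132"),
  stringsAsFactors = FALSE
)

# Hypothetical hand-built TERM2NAME table: pathway ID, then readable name.
term2name <- data.frame(
  PATH = c("05144", "00982"),
  NAME = c("Malaria", "Drug metabolism - cytochrome P450"),
  stringsAsFactors = FALSE
)

# Look up the readable name for each pathway ID, which is what a
# populated Description column would contain.
described <- term2name$NAME[match(tt$PATH, term2name$PATH)]
print(data.frame(ID = tt$PATH, Description = described))

# Not run here (needs clusterProfiler and the full data from the post):
# egmt <- enricher(as.vector(df[, 1]), pvalueCutoff = 1, qvalueCutoff = 1,
#                  pAdjustMethod = "BH", TERM2GENE = tt, TERM2NAME = term2name)
```

In a real analysis the name table would need one row per pathway ID present in tt, not just the two shown here.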

4 weeks ago by
Guangchuang Yu (Hong Kong) wrote:

What's the organism you want to analyze? According to https://www.ncbi.nlm.nih.gov/gene/100516980, it is Sus scrofa.

I guess the file in gmtfile <- "/path/c5.all.v6.1.entrez.gmt" is an annotation for human.

This is why it throws the message:


--> No gene can be mapped....  
--> Expected input gene ID: 27433,10846,23479,3669,65977,10808
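A quick way to confirm this kind of mismatch before calling enricher() or GSEA() is to count how many input IDs occur in the gene-set table at all. The following is a minimal base R sketch, not part of the original thread. The c5_toy table is a toy stand-in for the output of read.gmt(), and its column names and set name are assumptions. The gene IDs are the ones printed in the question.

```r
# Toy stand-in for the TERM2GENE table returned by read.gmt(); the real
# table has one row per (gene set, gene) pair. Column names and the set
# name are assumed for illustration.
c5_toy <- data.frame(
  term = rep("HYPOTHETICAL_GO_SET", 6),
  gene = c("27433", "10846", "23479", "3669", "65977", "10808"),
  stringsAsFactors = FALSE
)

# The two pig Entrez IDs shown in head(df) in the question.
input_ids <- c("100516980", "100155074")

# IDs present in both the input and the gene-set table.
shared <- intersect(input_ids, unique(c5_toy$gene))
n_shared <- length(shared)

cat("Input IDs found in the gene sets:", n_shared, "of", length(input_ids), "\n")
```

With these values the overlap is empty, which matches the "No gene can be mapped" message: the input IDs and the gene sets come from different organisms.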


Hi Yu!

Yes, I had used the wrong file. I then got the gene sets for Sus scrofa and ran the MSigDB gene set analysis. Thanks for pointing out the mistake.

Is the approach in the answer I posted an acceptable way to get a list of pathways? Thanks for your help.

written 4 weeks ago by thomasjenner333