Question: Clusterprofiler - MSigDB gene set analysis - Updated
9 months ago, thomasjenner333 wrote:

Hi,

I'm attempting to use the 'enricher' and 'GSEA' functions from the clusterProfiler package to analyze gene sets from MSigDB.

The following is the code I'm using:

> gmtfile <- "/path/c5.all.v6.1.entrez.gmt"
> c5 <- read.gmt(gmtfile)

> head(df)
   ENTREZID log2FoldChange
1 100516980     0.11587633
2 100155074     0.11587633

> egmt <- enricher(as.character(df[,1]), TERM2GENE=c5)
--> No gene can be mapped....
--> Expected input gene ID: 27433,10846,23479,3669,65977,10808
--> return NULL...

> head(geneList)
100154447    396596 100516171 100155895    397132 100515447
 6.035077  4.837211  4.629196  4.524015  4.420449  4.401480

> egmt2 <- GSEA(geneList, TERM2GENE=c5, verbose=FALSE)
--> Expected input gene ID: 54932,23001,3329,2035,9837,22894
Error in check_gene_id(geneList, geneSets) :
  --> No gene can be mapped....
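
For reference, GSEA is usually given a named numeric vector sorted in decreasing order, like the geneList printed above. A minimal base-R sketch of how such a vector might be built from a table shaped like df follows; the toy rows are illustrative (the third borrows an ID and value from the head(geneList) output) and are not the poster's full data.

```r
# Toy table with the same columns as `df` in the post (illustrative rows only).
df <- data.frame(
  ENTREZID = c("100516980", "100155074", "397132"),
  log2FoldChange = c(0.11587633, 0.11587633, 4.420449),
  stringsAsFactors = FALSE
)

# Numeric scores, named by gene ID, sorted from highest to lowest.
geneList <- df$log2FoldChange
names(geneList) <- df$ENTREZID
geneList <- sort(geneList, decreasing = TRUE)

print(head(geneList))
```

The names must use the same ID type and organism as the gene column of TERM2GENE, otherwise no gene can be mapped.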

ANSWER: I got a list of pathways by doing the following:

> eg = bitr(d$SYMBOL, fromType="SYMBOL", toType=c("PATH", "ENTREZID"), OrgDb="org.Ss.eg.db")
> head(eg)
    SYMBOL  PATH  ENTREZID
1    ACKR1 05144 100154447
2     FMO1 00982    397132

> tt <- eg[,c(2,3)]
> head(tt)
    PATH  ENTREZID
1  05144 100154447
2  00982    397132

> egmt <- enricher(as.vector(df[,1]), pvalueCutoff=1, qvalueCutoff=1, pAdjustMethod = "BH", TERM2GENE=tt)
> head(egmt)
         ID Description GeneRatio BgRatio pvalue p.adjust qvalue
00010 00010       00010   32/2868 32/2868      1        1      1
00020 00020       00020   20/2868 20/2868      1        1      1

I'm not sure why the pathway descriptions aren't displayed. Any suggestions? Thanks.
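
One possible explanation, offered as a hedged note: enricher also accepts an optional TERM2NAME table (term ID in the first column, readable name in the second), and in the output above, where none was supplied, the Description column simply repeats the ID. The sketch below builds such a table in base R. The term2name object and its hand-typed names are assumptions made for illustration, not something taken from the post.

```r
# TERM2GENE-style table as in the post: term ID first, gene ID second.
tt <- data.frame(
  PATH = c("05144", "00982"),
  ENTREZID = c("100154447", "397132"),
  stringsAsFactors = FALSE
)

# Hypothetical TERM2NAME table: term ID first, readable name second.
# The names are typed in by hand here purely for illustration.
term2name <- data.frame(
  PATH = c("05144", "00982"),
  NAME = c("Malaria", "Drug metabolism - cytochrome P450"),
  stringsAsFactors = FALSE
)

# Terms that have no readable name and would still show their bare ID.
unnamed <- setdiff(unique(tt$PATH), term2name$PATH)
print(unnamed)

# The table would then be passed alongside TERM2GENE, for example:
# egmt <- enricher(as.vector(df[,1]), TERM2GENE = tt, TERM2NAME = term2name)
```

In practice the name table would need to cover every pathway ID in tt, so it is better taken from a pathway annotation source than typed by hand.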

9 months ago, Guangchuang Yu (China/Guangzhou/Southern Medical University) wrote:

What organism do you want to analyze? According to https://www.ncbi.nlm.nih.gov/gene/100516980, it is Sus scrofa.

I guess that gmtfile <- "/path/c5.all.v6.1.entrez.gmt" is the annotation for human.

This is why it throws the message:

--> No gene can be mapped....
--> Expected input gene ID: 27433,10846,23479,3669,65977,10808
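
A quick way to confirm this kind of mismatch before calling enricher or GSEA is to count how many input IDs occur in the gene column of the TERM2GENE table. The following is a minimal base-R sketch under stated assumptions: the term names SET_A and SET_B are hypothetical, the gene IDs are the "expected" ones printed in the message above, and the input IDs are the two shown in the question.

```r
# Toy TERM2GENE table; the term names are hypothetical, the gene IDs are the
# "expected" Entrez IDs printed in the message above.
term2gene <- data.frame(
  term = c("SET_A", "SET_A", "SET_B", "SET_B", "SET_B", "SET_A"),
  gene = c("27433", "10846", "23479", "3669", "65977", "10808"),
  stringsAsFactors = FALSE
)

# Input IDs shown in the question.
input_genes <- c("100516980", "100155074")

# Genes shared between the input and the gene-set collection.
shared <- intersect(input_genes, unique(term2gene$gene))
n_shared <- length(shared)

cat("input genes:", length(input_genes), "\n")
cat("mapped to gene sets:", n_shared, "\n")
```

If the count is zero or close to it, the input IDs and the gene sets most likely come from different organisms or use different ID types.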


Hi Yu!

Yes, I had used the wrong file. I then got the gene sets for Sus scrofa and ran the MSigDB gene set analysis. Thanks for pointing out the mistake.

Is the approach in the answer I posted an acceptable way to get a list of pathways? Thanks for your help.

Reply written 9 months ago by thomasjenner333