Clusterprofiler - MSigDB gene set analysis - Updated
@thomasjenner333-15064

Hi,

I'm attempting to use the 'enricher' and 'GSEA' functions from the clusterProfiler package to analyze gene sets from MSigDB.

The following is the code I'm using:

> gmtfile <- "/path/c5.all.v6.1.entrez.gmt"
> c5 <- read.gmt(gmtfile)

> head(df)
   ENTREZID log2FoldChange
1 100516980     0.11587633
2 100155074     0.11587633

> egmt <- enricher(as.character(df[,1]), TERM2GENE=c5)
--> No gene can be mapped....
--> Expected input gene ID: 27433,10846,23479,3669,65977,10808
--> return NULL...

> head(geneList)
100154447    396596 100516171 100155895    397132 100515447
 6.035077  4.837211  4.629196  4.524015  4.420449  4.401480

> egmt2 <- GSEA(geneList, TERM2GENE=c5, verbose=FALSE)
--> Expected input gene ID: 54932,23001,3329,2035,9837,22894
Error in check_gene_id(geneList, geneSets) :
  --> No gene can be mapped....

ANSWER: I got a list of pathways by doing the following

eg = bitr(d$SYMBOL, fromType="SYMBOL", toType=c("PATH", "ENTREZID"), OrgDb="org.Ss.eg.db")
> head(eg)
    SYMBOL  PATH  ENTREZID
1    ACKR1 05144 100154447
2     FMO1 00982    397132

tt <- eg[,c(2,3)]
> head(tt)
    PATH  ENTREZID
1  05144 100154447
2  00982    397132

> egmt <- enricher(as.vector(df[,1]), pvalueCutoff=1, qvalueCutoff=1, pAdjustMethod = "BH", TERM2GENE=tt)
> head(egmt)
         ID Description GeneRatio BgRatio pvalue p.adjust qvalue
00010 00010       00010   32/2868 32/2868      1        1      1
00020 00020       00020   20/2868 20/2868      1        1      1

I'm not sure why the pathway descriptions aren't displayed. Any suggestions? Thanks
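One possible explanation, offered as a sketch rather than a confirmed diagnosis: enricher() accepts an optional TERM2NAME table (term ID in the first column, readable name in the second), and when it is not supplied the term ID is reused as the Description. The base R snippet below builds such a table for the two pathway IDs shown in head(tt). The pathway names are typed in by hand for illustration and the column name NAME is an assumption; the enricher() call itself is left commented out because it needs clusterProfiler and the poster's df.

```r
# TERM2GENE as in the post: column 1 = term ID, column 2 = gene ID.
tt <- data.frame(PATH = c("05144", "00982"),
                 ENTREZID = c("100154447", "397132"),
                 stringsAsFactors = FALSE)

# TERM2NAME: column 1 = term ID, column 2 = readable name.
# Names are entered by hand here purely for illustration.
term2name <- data.frame(PATH = c("05144", "00982"),
                        NAME = c("Malaria",
                                 "Drug metabolism - cytochrome P450"),
                        stringsAsFactors = FALSE)

# Sanity check: every term used in TERM2GENE should have a name.
missing_names <- setdiff(tt$PATH, term2name$PATH)

# With clusterProfiler loaded, the table would be passed like this (not run):
# egmt <- enricher(as.vector(df[,1]), TERM2GENE = tt, TERM2NAME = term2name)
```

If missing_names is non-empty, those terms would still show their ID in the Description column.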

clusterprofiler GSEA MSIGDB • 4.7k views
Guangchuang Yu ★ 1.2k
@guangchuang-yu-5419

Which organism do you want to analyze? According to https://www.ncbi.nlm.nih.gov/gene/100516980, it is Sus scrofa.

I guess the gmtfile <- "/path/c5.all.v6.1.entrez.gmt" is an annotation for human.

This is why it throws the message:


--> No gene can be mapped....  
--> Expected input gene ID: 27433,10846,23479,3669,65977,10808
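A quick way to catch this kind of mismatch before calling enricher() or GSEA() is to check how many input IDs occur in the gene-set table at all. The snippet below is a minimal base R sketch using the IDs quoted in this thread; mapped_fraction is a hypothetical helper name, not a clusterProfiler function.

```r
# Pig Entrez IDs from the question and the human IDs that the message
# listed under "Expected input gene ID".
input_ids    <- c("100516980", "100155074")
gene_set_ids <- c("27433", "10846", "23479", "3669", "65977", "10808")

# Hypothetical helper: fraction of distinct input IDs found in the gene sets.
mapped_fraction <- function(ids, term2gene_ids) {
  mean(unique(ids) %in% unique(term2gene_ids))
}

frac <- mapped_fraction(input_ids, gene_set_ids)
if (frac == 0) {
  message("No input ID occurs in the gene sets; check species and ID type.")
}
```

In practice the second argument would be the gene column of the table returned by read.gmt(), for example as.character(c5[[2]]), assuming the gene IDs sit in its second column.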


Hi Yu!

Yes, I had used the wrong file. I then got the gene sets for Sus scrofa and ran the MSigDB gene set analysis. Thanks for pointing out the mistake.

Is the approach in the answer I posted an acceptable way to get a list of pathways? Thanks for your help.
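One thing worth checking in the enricher output posted in the question: GeneRatio and BgRatio are identical (32/2868 for pathway 00010), which suggests the tested gene list and the background were the same set of genes. Assuming enricher() uses a one-sided hypergeometric test for over-representation, that situation gives a p-value of 1 for every term, as this small base R calculation sketches. The variable names are only for illustration.

```r
k <- 32    # input genes annotated to pathway 00010 (GeneRatio numerator)
M <- 32    # background genes annotated to pathway 00010 (BgRatio numerator)
N <- 2868  # background size (BgRatio denominator)
n <- 2868  # input list size after mapping (GeneRatio denominator)

# P(X >= k) when drawing n genes from N, of which M belong to the pathway.
p <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
```

Testing a smaller list of genes of interest against the full background would be needed for the p-values to carry any information.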
