Question

question on GO analysis using clusterProfiler

giuseppe0525 • 0

Hello everyone,

I ran into a problem doing GO analysis with clusterProfiler on differentially expressed genes (DEGs) derived from a microarray. I supplied a list of DEGs, but enrichGO() failed to map any of them.

I used the following script to do the analysis:

    data(geneList)
    gene <- names(geneList)
    gene
    head(gene)
    str(gene)
    ego <- enrichGO(gene          = gene,
                    universe      = names(geneList),
                    OrgDb         = org.Mm.eg.db,
                    ont           = "BP",
                    pAdjustMethod = "BH",
                    pvalueCutoff  = 0.05,
                    qvalueCutoff  = 0.1,
                    minGSSize     = 3,
                    maxGSSize     = 500
                    )
    head(ego)
    head(summary(ego))

Unfortunately, the following message was returned:

    --> No gene can be mapped....
    --> Expected input gene ID: 442829,71950,100041897,71711,11535,19264
    --> return NULL...
    > head(ego)

The input to enrichGO() is the set of differentially expressed genes that I obtained from the microarray data with limma.

However, the same genes did give GO enrichment results in PANTHER, an online GO analysis tool.

I even loaded the Entrez gene IDs from a text file (as shown in the attachment) but got the same error. The code to load the data was: genes <- read.csv("genes.txt", header = FALSE)

Could anyone help solve this problem?

Thanks in advance!

clusterprofiler geneontology • 7.0k views

updated 6.4 years ago by thokall • written 6.4 years ago by giuseppe0525

Hi,

It looks like there may be a problem with your input, but it is not possible to offer a solution without knowing the content of your geneList object.

In addition, the code you show passes the same content, names(geneList), to both the 'gene' and 'universe' arguments of enrichGO(). I believe you should instead pass all genes on the array as 'universe' and only the differentially expressed genes as 'gene'.
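
As a minimal sketch of that split, in base R only: the data frame `tt` below is a hypothetical stand-in for a limma result table with an Entrez ID annotation column. The column name `ENTREZID`, the values, and the cutoffs (adjusted p < 0.05, |logFC| > 1) are assumptions for illustration, not taken from the poster's data.

```r
# Hypothetical stand-in for a limma result table; all values are made up.
tt <- data.frame(
  ENTREZID  = c("16878", "20310", "67951", "14579", "17392", NA),
  logFC     = c(2.1, 1.8, 0.2, -0.1, -1.6, 3.0),
  adj.P.Val = c(0.001, 0.01, 0.60, 0.90, 0.03, 0.02),
  stringsAsFactors = FALSE
)

# Background: every annotated gene measured on the array.
universe <- unique(tt$ENTREZID[!is.na(tt$ENTREZID)])

# Foreground: only the annotated genes that pass the (assumed) DEG cutoffs.
is_deg <- !is.na(tt$ENTREZID) & tt$adj.P.Val < 0.05 & abs(tt$logFC) > 1
gene   <- unique(tt$ENTREZID[is_deg])

# The foreground should be a strict subset of the background.
stopifnot(all(gene %in% universe), length(gene) < length(universe))
```

The resulting `gene` and `universe` vectors would then be the values passed to the corresponding enrichGO() arguments.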

thokall • 6.4 years ago

Parts of the gene and universe inputs are listed below:

> gene
##
   [1] "4312"   "8318"   "10874"  "55143"  "55388"  "991"    "6280"   "2305"   "9493"   "1062"   "3868"   "4605"   "9833"   "9133"  
  [15] "6279"   "10403"  "8685"   "597"    "7153"   "23397"  "6278"   "79733"  "259266" "1381"   "3627"   "27074"  "6241"   "55165" 
  [29] "9787"   "7368"   "11065"  "55355"  "9582"   "220134" "55872"  "51203"  "3669"   "83461"  "22974"  "10460"  "10563"  "4751"

> expst.id <- getEG(as.character(expst$NAME), "mouse430a2")
> head(expst.entzid)
##
      V1
1  54161
2  11972
3  57437
4 100678
5  60409
6  13481

Then I ran the following script but got the same error:

> ego <- enrichGO(gene          = gene,
+                 keyType = "ENTREZID",
+                 universe      = expst.entzid,
+                 OrgDb         = org.Mm.eg.db,
+                 ont           = "BP",
+                 pAdjustMethod = "BH",
+                 pvalueCutoff  = 0.05,
+                 qvalueCutoff  = 0.1,
+                 minGSSize = 5,
+                 maxGSSize = 500
+                 )
##
--> No gene can be mapped....
--> Expected input gene ID: 16590,20662,69286,229357,21808,22415
--> return NULL...

Could it be due to NA values in the universe? Thanks!
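
One thing that may be worth checking, sketched in base R: the head() output above shows expst.entzid as a data frame with a column V1, whereas the enrichGO() documentation describes 'gene' and 'universe' as vectors of IDs. The snippet below rebuilds a small table like that from inline text (the NA row and the duplicate are added for illustration and are not from the thread) and reduces it to a clean character vector.

```r
# Inline stand-in for the ID file; the NA and the repeated 11972 are
# hypothetical additions to show the clean-up steps.
raw <- "54161\n11972\n57437\n100678\nNA\n60409\n13481\n11972"

# Read the IDs as character so they are never treated as numbers.
expst.entzid <- read.csv(text = raw, header = FALSE, colClasses = "character")

# Pull the single column out of the data frame, then drop NAs and duplicates.
ids      <- expst.entzid$V1
universe <- unique(ids[!is.na(ids)])

str(universe)
```

Whether NAs or the data frame are what triggers the message in this case is an assumption; the sketch only shows how to rule both out.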

giuseppe0525 • 6.4 years ago

Hello,

I would bet that your problem is related to the type of your gene IDs. Are they Entrez IDs? If not, try mapping your IDs to Entrez IDs and repeat the analysis. You could use bitr() for that, or do it externally through DAVID.

Miguel.Cosenza • 6.4 years ago

I have checked the gene IDs and they are Entrez IDs.

I used the bitr() function to convert the gene IDs and it worked well:

    gene.df <- bitr(gene, fromType = "ENTREZID",
                    toType = c("ENSEMBL", "SYMBOL"),
                    OrgDb = org.Mm.eg.db)
    head(gene.df)

##

> head(gene.df)
  ENTREZID            ENSEMBL SYMBOL
1    16878 ENSMUSG00000034394    Lif
2    20310 ENSMUSG00000058427  Cxcl2
3    67951 ENSMUSG00000001473  Tubb6
4    14579 ENSMUSG00000028214    Gem
5    17392 ENSMUSG00000043613   Mmp3
7   104027 ENSMUSG00000043079  Synpo

giuseppe0525 • 6.4 years ago

score 0 · Answer 1 · 2017-11-08


    data(geneList)
    ego <- enrichGO(gene          = gene,
                    universe      = names(geneList),
                    OrgDb         = org.Mm.eg.db,
    ...

If geneList was obtained via data(geneList), you are using a vector of human genes as the background while testing against mouse annotation (OrgDb = org.Mm.eg.db).

In that case, of course, none of the genes can be mapped.
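
A quick way to spot this kind of species mismatch before calling enrichGO() is to count how many input IDs appear in the target organism's ID set. The sketch below uses base R only. `mouse_ids` is a tiny hypothetical stand-in built from the IDs in the bitr() output earlier in the thread; with the annotation package loaded, the full set could presumably be obtained with keys(org.Mm.eg.db, keytype = "ENTREZID") instead.

```r
# First three IDs come from the poster's `gene` vector, the last three from
# the bitr() output shown in the thread.
gene <- c("4312", "8318", "10874", "16878", "20310", "67951")

# Hypothetical stand-in for the full set of mouse Entrez IDs.
mouse_ids <- c("16878", "20310", "67951", "14579", "17392", "104027")

# Which input IDs are present in the target organism's ID set?
mapped   <- gene %in% mouse_ids
unmapped <- gene[!mapped]

cat(sprintf("%d of %d input IDs found in the mouse ID set\n",
            sum(mapped), length(gene)))
cat("Not found:", paste(unmapped, collapse = ", "), "\n")
```

If none or only a few IDs are found, the input is probably from another species or another ID type.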

score 0 · Answer 2 · 2017-11-10

Hi,

Please try to clarify exactly what you have done. As Guangchuang Yu stated, your initial gene list does indeed contain human gene IDs, so they will not map to mouse. The code you supplied works with mouse gene IDs. The example below uses your code and part of your example data, with a mouse Entrez ID (54611) added and minGSSize changed to 1:

> dput(gene)
## c("54611", "4312", "8318", "10874")

> dput(uni)
## c("54611", "54161", "11972", "8312", "4312", "8318", "10874")

> ego <- enrichGO(gene          = gene,
+                  keytype = "ENTREZID",
+                  universe      = uni,
+                  OrgDb         = org.Mm.eg.db,
+                  ont           = "BP",
+                  pAdjustMethod = "BH",
+                  pvalueCutoff  = 0.05,
+                  qvalueCutoff  = 0.1,
+                  minGSSize = 1,
+                  maxGSSize = 500
+                  )

> ego
#
# over-representation test
#
#...@organism      Mus musculus 
#...@ontology      BP 
#...@keytype      ENTREZID 
#...@gene      chr [1:4] "54611" "4312" "8318" "10874"
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...0 enriched terms found
'data.frame':    0 obs. of  9 variables:
 $ ID         : chr 
 $ Description: chr 
 $ GeneRatio  : chr 
 $ BgRatio    : chr 
 $ pvalue     : num 
 $ p.adjust   : num 
 $ qvalue     : num 
 $ geneID     : chr 
 $ Count      : int 
#...Citation
  Guangchuang Yu, Li-Gen Wang, Yanyan Han and Qing-Yu He.
  clusterProfiler: an R package for comparing biological themes among
  gene clusters. OMICS: A Journal of Integrative Biology
  2012, 16(5):284-287