Question: question on GO analysis using clusterProfiler
16 days ago, giuseppe0525 wrote:

Hello everyone,

I ran into a problem while doing a GO analysis with clusterProfiler on differentially expressed genes (DEGs) from a microarray experiment. I supplied a list of DEGs, but enrichGO() failed to map any of them.

I used the following script for the analysis:

data(geneList)
gene <- names(geneList)
gene
head(gene)
str(gene)

ego <- enrichGO(gene          = gene,
                universe      = names(geneList),
                OrgDb         = org.Mm.eg.db,
                ont           = "BP",
                pAdjustMethod = "BH",
                pvalueCutoff  = 0.05,
                qvalueCutoff  = 0.1,
                minGSSize     = 3,
                maxGSSize     = 500)
head(ego)
head(summary(ego))

Unfortunately, the following message was returned:

--> No gene can be mapped....
--> Expected input gene ID: 442829,71950,100041897,71711,11535,19264
--> return NULL...

> head(ego)

The input to enrichGO() came from the differentially expressed genes that I identified in the microarray data with limma.

However, I did get GO enrichment results with PANTHER, an online GO analysis tool.

I also tried loading the Entrez gene IDs from a text file (as shown in the attachment) but got the same error. The code to load the data was: genes <- read.csv("genes.txt", header = FALSE)

Could anyone help me solve this problem?

Thanks in advance!

modified 14 days ago by thokall • written 16 days ago by giuseppe0525

Hi,

It looks like there may be a problem with your input, but it is not possible to offer a solution without knowing the content of your geneList object.

In addition, the code you show passes the same content, names(geneList), to both the 'gene' and the 'universe' arguments of the enrichGO() call. I believe you should instead pass all genes on the array as 'universe' and only the differentially expressed genes as 'gene'.

written 16 days ago by thokall
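
To illustrate the gene/universe distinction raised in the reply above, here is a minimal base R sketch. It assumes a results table shaped like a limma topTable() output with an added ENTREZID column. That column name, the 0.05 cutoff and the p-values are hypothetical and for illustration only; the IDs are borrowed from the bitr() output later in this thread.

```r
# Toy stand-in for a limma topTable() result. The ENTREZID column and the
# adj.P.Val values are hypothetical and only illustrate the idea.
tt <- data.frame(
  ENTREZID  = c("16878", "20310", "67951", "14579", "17392", NA),
  adj.P.Val = c(0.001, 0.20, 0.03, 0.75, 0.04, 0.01),
  stringsAsFactors = FALSE
)

# Background: every gene measured on the array that has an Entrez ID.
universe <- unique(tt$ENTREZID[!is.na(tt$ENTREZID)])

# Foreground: only the genes that pass the (assumed) significance cutoff.
is_de <- !is.na(tt$ENTREZID) & tt$adj.P.Val < 0.05
gene  <- unique(tt$ENTREZID[is_de])

# The foreground should always be a subset of the background.
stopifnot(all(gene %in% universe))
```

The resulting 'gene' and 'universe' character vectors are what one would then hand to enrichGO(), instead of passing the same vector to both arguments.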

Parts of the gene and universe inputs are listed below:

> gene
##
   [1] "4312"   "8318"   "10874"  "55143"  "55388"  "991"    "6280"   "2305"   "9493"   "1062"   "3868"   "4605"   "9833"   "9133"  
  [15] "6279"   "10403"  "8685"   "597"    "7153"   "23397"  "6278"   "79733"  "259266" "1381"   "3627"   "27074"  "6241"   "55165" 
  [29] "9787"   "7368"   "11065"  "55355"  "9582"   "220134" "55872"  "51203"  "3669"   "83461"  "22974"  "10460"  "10563"  "4751"  

 

> expst.id <- getEG(as.character(expst$NAME), "mouse430a2")
> head(expst.entzid)
##
      V1
1  54161
2  11972
3  57437
4 100678
5  60409
6  13481

 

Then I ran the following script but got the same error:

> ego <- enrichGO(gene          = gene,
+                 keyType = "ENTREZID",
+                 universe      = expst.entzid,
+                 OrgDb         = org.Mm.eg.db,
+                 ont           = "BP",
+                 pAdjustMethod = "BH",
+                 pvalueCutoff  = 0.05,
+                 qvalueCutoff  = 0.1,
+                 minGSSize = 5,
+                 maxGSSize = 500
+                 )
##
--> No gene can be mapped....
--> Expected input gene ID: 16590,20662,69286,229357,21808,22415
--> return NULL...

 

Is it due to NA values in the universe? Thanks!

modified 14 days ago • written 14 days ago by giuseppe0525
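
One thing that may be worth checking in the reply above: the printed head(expst.entzid) has a V1 column header, which suggests it is a data frame rather than a plain vector, and read.csv() also returns a data frame. As far as I know, enrichGO() expects character vectors for 'gene' and 'universe'. Whether this contributes to the error here is an assumption, but the conversion is cheap. A minimal sketch, where the text = argument stands in for the genes.txt file mentioned in the question:

```r
# read.csv() returns a data frame, even for a one-column file.
# 'text =' stands in for the genes.txt file; the IDs are the first few
# values printed by head(expst.entzid), with one duplicate added.
genes <- read.csv(text = "54161\n11972\n57437\n54161", header = FALSE)
class(genes)

# Pull out the single column, drop NAs and duplicates, and convert the
# integer IDs to character.
ids <- unique(as.character(genes$V1[!is.na(genes$V1)]))
ids
```

Dropping NA values at this step also removes the NAs asked about above before the vector reaches enrichGO().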

Hello, 

I would bet that your problem is related to the type of your gene IDs. Are they Entrez IDs? If not, try mapping your IDs to Entrez IDs and repeating the analysis. You could use bitr() to do that, or do it externally through DAVID.

written 16 days ago by Miguel.Cosenza

 

I checked the gene IDs and they are Entrez IDs.

I used the bitr() function to convert the gene IDs and it worked well:

gene.df <- bitr(gene, fromType = "ENTREZID",
                toType = c("ENSEMBL", "SYMBOL"),
                OrgDb = org.Mm.eg.db)
head(gene.df)

## 

> head(gene.df)
  ENTREZID            ENSEMBL SYMBOL
1    16878 ENSMUSG00000034394    Lif
2    20310 ENSMUSG00000058427  Cxcl2
3    67951 ENSMUSG00000001473  Tubb6
4    14579 ENSMUSG00000028214    Gem
5    17392 ENSMUSG00000043613   Mmp3
7   104027 ENSMUSG00000043079  Synpo
written 14 days ago by giuseppe0525
16 days ago, Guangchuang Yu (Hong Kong) wrote:

data(geneList)

ego <- enrichGO(gene          = gene,
                universe      = names(geneList),
                OrgDb         = org.Mm.eg.db,
...

If geneList was obtained via data(geneList), then it is the human example dataset shipped with the DOSE package, so you are using a vector of human genes as the background while testing against mouse annotation (OrgDb = org.Mm.eg.db).

If this is the case, of course none of the genes can be mapped.

written 16 days ago by Guangchuang Yu
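
A quick way to catch the species mismatch described in this answer is to count how many input IDs the target annotation knows before calling enrichGO(). The sketch below is base R only. The helper name check_id_overlap is hypothetical. The toy mouse_ids vector is just the six IDs from the "Expected input gene ID" message above, standing in for the full key set that one would normally pull from the OrgDb.

```r
# Report how many input IDs are known to the annotation you plan to use.
# In a real session 'valid_ids' would come from the OrgDb, for example:
#   valid_ids <- AnnotationDbi::keys(org.Mm.eg.db, keytype = "ENTREZID")
# check_id_overlap() is a hypothetical helper, not part of clusterProfiler.
check_id_overlap <- function(ids, valid_ids) {
  ids  <- unique(as.character(ids))
  hits <- ids %in% valid_ids
  list(n_input = length(ids), n_mapped = sum(hits), unmapped = ids[!hits])
}

# Toy data from this thread: six mouse IDs from the error message and the
# first three (human) IDs from the printed 'gene' vector.
mouse_ids <- c("442829", "71950", "100041897", "71711", "11535", "19264")
human_ids <- c("4312", "8318", "10874")

res <- check_id_overlap(human_ids, mouse_ids)
res$n_mapped
```

If n_mapped is zero or very small against the real key set, the IDs most likely belong to a different species or ID type than the OrgDb.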
14 days ago, thokall (Uppsala University) wrote:

Hi,

Please try to clarify exactly what you have done. As Guangchuang Yu stated, your initial gene list does indeed contain human gene IDs, so they will not map to mouse. The code you supplied works with mouse gene IDs. The example below uses your code and part of your example data, but adds a mouse Entrez ID (54611) and changes minGSSize to 1:

> dput(gene)
## c("54611", "4312", "8318", "10874")

> dput(uni)
## c("54611", "54161", "11972", "8312", "4312", "8318", "10874")

> ego <- enrichGO(gene          = gene,
+                  keytype = "ENTREZID",
+                  universe      = uni,
+                  OrgDb         = org.Mm.eg.db,
+                  ont           = "BP",
+                  pAdjustMethod = "BH",
+                  pvalueCutoff  = 0.05,
+                  qvalueCutoff  = 0.1,
+                  minGSSize = 1,
+                  maxGSSize = 500
+                  )

> ego
#
# over-representation test
#
#...@organism      Mus musculus 
#...@ontology      BP 
#...@keytype      ENTREZID 
#...@gene      chr [1:4] "54611" "4312" "8318" "10874"
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...0 enriched terms found
'data.frame':    0 obs. of  9 variables:
 $ ID         : chr 
 $ Description: chr 
 $ GeneRatio  : chr 
 $ BgRatio    : chr 
 $ pvalue     : num 
 $ p.adjust   : num 
 $ qvalue     : num 
 $ geneID     : chr 
 $ Count      : int 
#...Citation
  Guangchuang Yu, Li-Gen Wang, Yanyan Han and Qing-Yu He.
  clusterProfiler: an R package for comparing biological themes among
  gene clusters. OMICS: A Journal of Integrative Biology
  2012, 16(5):284-287 

written 14 days ago by thokall