Question: question on GO analysis using clusterProfiler
3 months ago, giuseppe0525 wrote:

Hello everyone,

I ran into a problem when doing GO analysis with clusterProfiler on differentially expressed genes (DEGs) derived from a microarray. I supplied a list of DEGs, but enrichGO() failed to map any of the genes.

I used the following script for the analysis:


data(geneList)
gene <- names(geneList)
gene
head(gene)
str(gene)

ego <- enrichGO(gene          = gene,
                universe      = names(geneList),
                OrgDb         = org.Mm.eg.db,
                ont           = "BP",
                pAdjustMethod = "BH",
                pvalueCutoff  = 0.05,
                qvalueCutoff  = 0.1,
                minGSSize     = 3,
                maxGSSize     = 500)
head(ego)
head(summary(ego))

Unfortunately, the following message was returned:

--> No gene can be mapped....
--> Expected input gene ID: 442829,71950,100041897,71711,11535,19264
--> return NULL...

> head(ego)


The input to enrichGO() is the set of differentially expressed genes that I obtained from the microarray data using limma.

However, I did get GO enrichment results with PANTHER, an online GO analysis tool.

I even loaded the Entrez gene IDs from a text file (as shown in the attachment) but got the same error. The code to load the data was:

genes <- read.csv("genes.txt", header = FALSE)

Could anyone help solve this problem?

Thanks in advance!


modified 3 months ago by thokall • written 3 months ago by giuseppe0525

Hi,

It looks like there might be some issues with your input, but it is not possible to offer a solution without knowing the content of your geneList object.

In addition, in the code you show, the same content, names(geneList), is given to both the 'gene' and the 'universe' argument of the enrichGO() call. I believe you should instead supply all genes on the array as 'universe' and only the differentially expressed genes as 'gene'.
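
To illustrate that split, here is a minimal sketch assuming a limma-style results table. The column name `ENTREZID`, the 0.05 cutoff on `adj.P.Val`, and the toy values are assumptions made for illustration, not taken from the poster's data.

```r
# Toy stand-in for a limma topTable() result; the p-values are made up.
res <- data.frame(
  ENTREZID  = c("16878", "20310", "67951", "14579", "17392", NA),
  adj.P.Val = c(0.001, 0.20, 0.03, 0.70, 0.04, 0.01),
  stringsAsFactors = FALSE
)

# Universe: every gene measured on the array (probes without an Entrez ID dropped).
has_id   <- !is.na(res$ENTREZID)
universe <- unique(res$ENTREZID[has_id])

# Genes of interest: only the differentially expressed subset of that universe.
gene <- unique(res$ENTREZID[has_id & res$adj.P.Val < 0.05])
```

Both vectors are plain character vectors of Entrez IDs, and `gene` is a strict subset of `universe`, which is what the over-representation test expects.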

written 3 months ago by thokall

Parts of the 'gene' and 'universe' inputs are listed below:

> gene
##
   [1] "4312"   "8318"   "10874"  "55143"  "55388"  "991"    "6280"   "2305"   "9493"   "1062"   "3868"   "4605"   "9833"   "9133"  
  [15] "6279"   "10403"  "8685"   "597"    "7153"   "23397"  "6278"   "79733"  "259266" "1381"   "3627"   "27074"  "6241"   "55165" 
  [29] "9787"   "7368"   "11065"  "55355"  "9582"   "220134" "55872"  "51203"  "3669"   "83461"  "22974"  "10460"  "10563"  "4751"  

 

> expst.id <- getEG(as.character(expst$NAME), "mouse430a2")
> head(expst.entzid)
##
      V1
1  54161
2  11972
3  57437
4 100678
5  60409
6  13481

 

Then I ran the following script but got the same error:

> ego <- enrichGO(gene          = gene,
+                 keyType = "ENTREZID",
+                 universe      = expst.entzid,
+                 OrgDb         = org.Mm.eg.db,
+                 ont           = "BP",
+                 pAdjustMethod = "BH",
+                 pvalueCutoff  = 0.05,
+                 qvalueCutoff  = 0.1,
+                 minGSSize = 5,
+                 maxGSSize = 500
+                 )
##
--> No gene can be mapped....
--> Expected input gene ID: 16590,20662,69286,229357,21808,22415
--> return NULL...

 

Is it due to the NA given to the universe? Thanks!

modified 3 months ago • written 3 months ago by giuseppe0525

Hello, 

I would bet that your problem is related to the coding of your gene IDs. Are they Entrez IDs? If not, try mapping your IDs to Entrez IDs and repeating the analysis. You could use bitr() for that, or do it externally through DAVID.
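
As a quick sanity check before calling enrichGO(), something along these lines may help. It is only a sketch: `as_id_vector()` and `looks_like_entrez()` are hypothetical helper names, and the check relies on the fact that Entrez Gene IDs are plain integers. It also covers the case where the IDs were loaded with read.csv(header = FALSE) and are therefore a data frame column (V1) rather than a character vector.

```r
# Hypothetical helper: coerce whatever was loaded into a clean character vector.
as_id_vector <- function(x) {
  if (is.data.frame(x)) x <- x[[1]]         # e.g. column V1 from read.csv(header = FALSE)
  if (is.numeric(x)) x <- as.integer(x)     # avoids strings such as "1e+05"
  x <- trimws(as.character(x))
  unique(x[!is.na(x) & x != "" & x != "NA"])
}

# Entrez Gene IDs consist of digits only; anything else is probably another ID type.
looks_like_entrez <- function(ids) grepl("^[0-9]+$", ids)

# Example input shaped like the head(expst.entzid) output in this thread, plus an NA.
ids <- as_id_vector(data.frame(V1 = c(54161L, 11972L, NA, 57437L)))
all(looks_like_entrez(ids))
```

If the last line returns FALSE, the IDs are probably symbols, Ensembl IDs, or probe IDs and need converting first. A TRUE result only says the format is plausible, not that the IDs belong to the right organism.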

written 3 months ago by Miguel.Cosenza

 

I have checked the coding of the gene IDs and they are Entrez IDs.

I used the bitr() function to convert the gene IDs and it worked well:

gene.df <- bitr(gene, fromType = "ENTREZID",
                toType = c("ENSEMBL", "SYMBOL"),
                OrgDb = org.Mm.eg.db)

> head(gene.df)
  ENTREZID            ENSEMBL SYMBOL
1    16878 ENSMUSG00000034394    Lif
2    20310 ENSMUSG00000058427  Cxcl2
3    67951 ENSMUSG00000001473  Tubb6
4    14579 ENSMUSG00000028214    Gem
5    17392 ENSMUSG00000043613   Mmp3
7   104027 ENSMUSG00000043079  Synpo
written 3 months ago by giuseppe0525

3 months ago, Guangchuang Yu (Hong Kong) wrote:

data(geneList)

ego <- enrichGO(gene          = gene,
                universe      = names(geneList),
                OrgDb         = org.Mm.eg.db,
...

If geneList was obtained via data(geneList), then you are using a vector of human genes as the background while testing against the mouse annotation (OrgDb = org.Mm.eg.db).

If this is the case, then of course none of the genes can be mapped.
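
One way to spot this kind of species mismatch early is to check what fraction of the input IDs are valid keys of the annotation database. The sketch below is an illustration only: `mapped_fraction()` is a hypothetical helper, and the small `mouse_keys` vector is hand-picked from IDs quoted in this thread so that the example runs without any annotation package installed.

```r
# Fraction of the (unique) input IDs that are valid keys for an annotation database.
mapped_fraction <- function(ids, valid_keys) {
  ids <- unique(as.character(ids))
  mean(ids %in% valid_keys)
}

# With the annotation package loaded, the real key set could be obtained with
#   mouse_keys <- AnnotationDbi::keys(org.Mm.eg.db, keytype = "ENTREZID")
# Here a few mouse IDs from this thread stand in for it (assumption for illustration).
mouse_keys <- c("16878", "20310", "67951", "14579", "17392", "104027")

# Three IDs from the data(geneList) vector plus one mouse ID.
query <- c("4312", "8318", "10874", "16878")

mapped_fraction(query, mouse_keys)
```

A fraction close to 0 for both 'gene' and 'universe' suggests the IDs come from a different organism or ID type than the OrgDb being tested.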

3 months ago, thokall (Swedish Museum of Natural History) wrote:

Hi,

Please try to clarify exactly what you have done. As Guangchuang Yu stated, your initial gene list does indeed contain human gene IDs, so they will not map to mouse. The code you supplied works with mouse gene IDs. The example below uses your code and part of your example data, but adds a mouse Entrez ID (54611) and changes minGSSize to 1:

> dput(gene)
## c("54611", "4312", "8318", "10874")

> dput(uni)
## c("54611", "54161", "11972", "8312", "4312", "8318", "10874")

> ego <- enrichGO(gene          = gene,
+                  keytype = "ENTREZID",
+                  universe      = uni,
+                  OrgDb         = org.Mm.eg.db,
+                  ont           = "BP",
+                  pAdjustMethod = "BH",
+                  pvalueCutoff  = 0.05,
+                  qvalueCutoff  = 0.1,
+                  minGSSize = 1,
+                  maxGSSize = 500
+                  )

> ego
#
# over-representation test
#
#...@organism      Mus musculus 
#...@ontology      BP 
#...@keytype      ENTREZID 
#...@gene      chr [1:4] "54611" "4312" "8318" "10874"
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...0 enriched terms found
'data.frame':    0 obs. of  9 variables:
 $ ID         : chr 
 $ Description: chr 
 $ GeneRatio  : chr 
 $ BgRatio    : chr 
 $ pvalue     : num 
 $ p.adjust   : num 
 $ qvalue     : num 
 $ geneID     : chr 
 $ Count      : int 
#...Citation
  Guangchuang Yu, Li-Gen Wang, Yanyan Han and Qing-Yu He.
  clusterProfiler: an R package for comparing biological themes among
  gene clusters. OMICS: A Journal of Integrative Biology
  2012, 16(5):284-287 
