Confused about BgRatio in enrichKEGG results from clusterProfiler
@naturehunger-10178

I am trying to run the clusterProfiler examples from "http://www.bioconductor.org/packages/devel/bioc/vignettes/clusterProfiler/inst/doc/clusterProfiler.html#abstract" and "https://guangchuangyu.github.io/2016/05/convert-biological-id-with-kegg-api-using-clusterprofiler/". I have a question about the BgRatio column. Why do these two examples differ so much in the number of background genes? I checked the newest database of human KEGG pathways, and it contains only 7192 genes. Am I making a mistake?

In the Bioconductor example, there are 7164 background genes.

kk <- enrichKEGG(gene         = gene,
                 organism     = 'hsa',
                 pvalueCutoff = 0.05)
head(kk)

...

##           BgRatio       pvalue     p.adjust       qvalue
## hsa04110 124/7164 1.706341e-07 2.951969e-05 0.0000290976
## hsa04114 124/7164 1.569415e-06 1.357544e-04 0.0001338133
## hsa03320  72/7164 1.884398e-05 1.086670e-03 0.0010711317
## hsa04914  98/7164 9.664771e-04 4.180013e-02 0.0412024432
## hsa04115  69/7164 1.226862e-03 4.244943e-02 0.0418424543
## hsa04062 187/7164 1.485638e-03 4.283588e-02 0.0422233843

...

In the GitHub example, there are 9275 background genes.

x <- enrichKEGG(np2up[,2], organism='hsa', keyType='uniprot')

...

##           BgRatio       pvalue   p.adjust     qvalue
## hsa04072 216/9275 0.0002654190 0.03901659 0.03240905
## hsa04060 354/9275 0.0005349245 0.03931695 0.03265855
## hsa04390 213/9275 0.0009536247 0.04199404 0.03488227
## hsa04975  58/9275 0.0014014886 0.04199404 0.03488227
## hsa05221  86/9275 0.0014283687 0.04199404 0.03488227

...
Guangchuang Yu ★ 1.2k
@guangchuang-yu-5419

The number changes because your input keyType changed.

The actual gene annotation covers 7164 genes (this may change when KEGG is updated), and it is based on ENTREZGENEID. When the gene IDs are mapped to UniProt, the number of annotated proteins increases because multiple mappings exist (ID mapping is not always 1-to-1).
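
The effect of a mapping that is not 1-to-1 can be illustrated with a small toy example in base R. This is only a sketch: the pathway names, gene IDs, and protein accessions below are made up, the helper `bg_ratio` is hypothetical, and it assumes BgRatio is read as "IDs annotated to the pathway / all annotated IDs". It does not query KEGG or call clusterProfiler.

```r
# Toy KEGG-style annotation keyed by gene ID (all IDs are made up).
pathway2gene <- data.frame(
  pathway = c("pathA", "pathA", "pathA", "pathB", "pathB"),
  entrez  = c("g1", "g2", "g3", "g3", "g4"),
  stringsAsFactors = FALSE
)

# Toy gene-to-protein map: g1 and g3 each map to two protein accessions.
gene2protein <- data.frame(
  entrez  = c("g1", "g1", "g2", "g3", "g3", "g4"),
  uniprot = c("P1a", "P1b", "P2", "P3a", "P3b", "P4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: builds "M/N" strings, where M is the number of
# unique IDs annotated to a pathway and N is the number of unique IDs
# annotated to any pathway.
bg_ratio <- function(annotation, id_col) {
  n_total <- length(unique(annotation[[id_col]]))
  n_path  <- tapply(annotation[[id_col]], annotation$pathway,
                    function(ids) length(unique(ids)))
  setNames(paste0(n_path, "/", n_total), names(n_path))
}

# Gene-level background.
bg_gene <- bg_ratio(pathway2gene, "entrez")

# Protein-level background: translate the annotation through the ID map.
pathway2protein <- merge(pathway2gene, gene2protein, by = "entrez")
bg_protein <- bg_ratio(pathway2protein, "uniprot")

print(bg_gene)
print(bg_protein)
```

In this toy data the gene-level ratios are 3/4 and 2/4, while the protein-level ratios become 5/6 and 3/6: both the pathway counts and the total background grow once one gene can map to several proteins.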

 

 

 


Thanks a lot! I am trying to apply this to a list of differentially expressed genes from my project, and I have tried both methods. The number of genes in the same enriched KEGG pathway differs between them, and consequently the KEGG pathways are ranked differently. Does that mean I can only analyze my gene list as in the Bioconductor example? Can I not convert my gene list to UniProt IDs for the analysis?


It depends on whether your input list is at the gene level or the protein level.

 


 


Thanks. I have another question. Why does the setReadable function not work on enrichKEGG output with Entrez gene IDs, while it does work with UniProt IDs? I found through Google that previous versions had a readable parameter, but it does not work now.


The `setReadable` function has always existed and works with enrichKEGG output.

I guess you mean the `readable` parameter.

enrichKEGG now works with online data and supports more than 4000 species. For most of these species, there is no data to support ID conversion, so the `readable` parameter was removed when enrichKEGG switched to online data.

For species that have an OrgDb object/package available, you can still convert IDs using the `setReadable` function.

If some ID types work for you and some do not, follow the guide at https://guangchuangyu.github.io/2016/07/how-to-bug-author/ and post a reproducible example.


Thank you very much! The reason is that I am not using the newest version of clusterProfiler, probably because my Bioconductor (v3.3) is not the newest version. When I follow the installation instructions:

source("https://bioconductor.org/biocLite.R")
biocLite("clusterProfiler")

it downloads clusterProfiler 3.0.5 automatically, but the up-to-date version is 3.2.8.
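
A version mismatch like this can be checked directly in R by comparing version numbers. The snippet below is a minimal sketch that hard-codes the two version strings quoted above so that it runs without clusterProfiler installed; the variable names are only illustrative.

```r
# Versions quoted in this post, parsed as comparable version objects.
installed <- package_version("3.0.5")
current   <- package_version("3.2.8")

# TRUE when the installed package lags behind the current release.
is_outdated <- installed < current
print(is_outdated)

# In a live session, the installed version could be read with
# utils::packageVersion("clusterProfiler") instead of hard-coding it.
```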


The release version of Bioconductor is 3.4.

You should use the latest clusterProfiler.

See the `setReadable` section in https://guangchuangyu.github.io/2016/05/convert-biological-id-with-kegg-api-using-clusterprofiler/

If your input gene list consists of Entrez gene IDs, you can use something like:

y <- setReadable(x, 'org.Hs.eg.db', keytype="ENTREZID")

 

Thanks! It works fine now! 
