(GOSemSim package) secondary GO terms generating NA values
@lwillianpacheco-18611
Brazil/Rio de Janeiro/UFRJ

Dear all,

I'm using GOSemSim for computing semantic similarities between GO terms.

When I run the similarity analysis with the Wang method, there is no problem:

> library("GOSemSim")

> hsGOBP <- godata('org.Hs.eg.db', ont="BP")
preparing gene to GO mapping data...
preparing IC data...

> goSim("GO:0042346", "GO:0042346", semData=hsGOBP, measure="Wang")
[1] 1

But if I change the method to Resnik, Rel, Jiang or Lin, the analysis returns only NA:

> goSim("GO:0042346", "GO:0042346", semData=hsGOBP, measure="Rel")
[1] NA

By manipulating the vector, I found that the NA is caused by a secondary GO term, GO:0042346 (shown above).

Is there a way to analyze a vector that contains secondary GO terms?

 

Thanks,
Luis Arge

 

--

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=pt_BR.UTF-8       LC_NUMERIC=C               LC_TIME=pt_BR.UTF-8       
 [4] LC_COLLATE=pt_BR.UTF-8     LC_MONETARY=pt_BR.UTF-8    LC_MESSAGES=pt_BR.UTF-8   
 [7] LC_PAPER=pt_BR.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=pt_BR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] org.Hs.eg.db_3.7.0   AnnotationDbi_1.44.0 IRanges_2.16.0       S4Vectors_0.20.1    
[5] Biobase_2.42.0       BiocGenerics_0.28.0  GOSemSim_2.8.0      

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0      GO.db_3.7.0     digest_0.6.18   DBI_1.0.0       RSQLite_2.1.1  
 [6] blob_1.1.1      tools_3.5.1     bit64_0.9-7     bit_1.1-14      compiler_3.5.1
[11] pkgconfig_2.0.2 memoise_1.1.0 

gosemsim • 1.2k views
Guangchuang Yu ★ 1.2k
@guangchuang-yu-5419
China/Guangzhou/Southern Medical Univer…
This happens because IC-based methods are species-specific, and this GO term is not valid for human: it has no entry in the human GO-to-gene mapping.

> require(org.Hs.eg.db)

> get('GO:0042346', org.Hs.egGO2EG)
Error in .checkKeys(value, Rkeys(x), x@ifnotfound) : 
  value for "GO:0042346" not found
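To find every term in a vector that would produce NA before running the full analysis, one option is to score each term against itself and keep the IDs that come back as NA. The following is only a sketch: `find_na_terms` and its `sim_fun` argument are hypothetical names introduced here, not part of GOSemSim. The GOSemSim part reuses the `godata()` and `goSim()` calls from the question and is guarded so the block still runs when the packages are not installed.

```r
# Return the IDs whose self-similarity is NA under a given scoring function.
# `sim_fun` (hypothetical argument): any function(id1, id2) returning one value.
find_na_terms <- function(ids, sim_fun) {
  is_na <- vapply(ids, function(id) is.na(sim_fun(id, id)), logical(1))
  ids[is_na]
}

# Usage with GOSemSim; skipped if the packages are not available.
if (requireNamespace("GOSemSim", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  hsGOBP <- GOSemSim::godata('org.Hs.eg.db', ont = "BP")

  # Wrap goSim() with an IC-based measure ("Rel", as in the question).
  rel_sim <- function(a, b) {
    GOSemSim::goSim(a, b, semData = hsGOBP, measure = "Rel")
  }

  # Replace this vector with your own GO IDs.
  my_terms <- c("GO:0042346")
  print(find_na_terms(my_terms, rel_sim))
}
```

The flagged IDs can then be inspected or replaced one by one. Results may differ between annotation releases, so it is worth re-running the check after updating org.Hs.eg.db.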

Thank you Guangchuang.

I'll check all the GO terms and replace any secondary IDs with their primary GO IDs.

--

Luis Arge
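For the replacement step described in this reply, a minimal sketch in base R is shown below. It assumes you already have a lookup from secondary to primary IDs as a named character vector; `to_primary` and `sec2primary` are hypothetical names, and the IDs in the toy lookup are made-up placeholders, not real GO terms. One possible source for a real lookup is the `GOSYNONYM` map in GO.db, but how it covers your IDs should be checked against your installed GO.db version.

```r
# Replace secondary IDs with their primary IDs.
# `sec2primary` (hypothetical): named character vector,
#   names = secondary IDs, values = primary IDs.
to_primary <- function(ids, sec2primary) {
  hit <- ids %in% names(sec2primary)
  ids[hit] <- unname(sec2primary[ids[hit]])
  # Two secondary IDs can map to the same primary ID, so drop duplicates.
  unique(ids)
}

# Toy lookup with placeholder IDs (not real GO terms).
lookup <- c("SEC:0000001" = "PRI:0000010",
            "SEC:0000002" = "PRI:0000010")

cleaned <- to_primary(c("SEC:0000001", "PRI:0000020", "SEC:0000002"), lookup)
print(cleaned)
```

IDs that are not in the lookup pass through unchanged, so a term that is simply unannotated for the species will still return NA with IC-based measures.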
