Error in .checkKeys(value, Lkeys(x), x@ifnotfound) : value for "GO:0000059" not found
anton.kratz ▴ 60
@antonkratz-8836
United States, San Diego, UCSD

Using the GOSemSim package, I am trying to generate a data frame which contains the semantic similarity between any two gene identifiers in yeast, like this (minimal, reproducible example):

library("GOSemSim")
library("org.Sc.sgd.db")  # attach explicitly so keys() below can find the annotation object
yeastGO <- godata("org.Sc.sgd.db", keytype = "ENSEMBL", ont = "BP", computeIC = TRUE)
genes <- keys(org.Sc.sgd.db, keytype = "ENSEMBL")
my_sim_matrix <- mgeneSim(genes,
                          semData = yeastGO,
                          measure = "Resnik",
                          combine = "max",
                          verbose = FALSE)

However, I run into this error:

Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
  value for "GO:0000059" not found

I have tried various things and read the documentation, but I cannot figure out what is going on. genes is a character vector that I took directly from org.Sc.sgd.db, so I would assume that all genes and GO terms are properly matched. Any help would be very much appreciated.

P.S.:

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] org.Sc.sgd.db_3.5.0  AnnotationDbi_1.40.0 IRanges_2.12.0      
[4] S4Vectors_0.16.0     Biobase_2.38.0       BiocGenerics_0.24.0
[7] GOSemSim_2.4.1      

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15    GO.db_3.5.0     digest_0.6.15   DBI_0.7        
 [5] RSQLite_2.0     pillar_1.1.0    rlang_0.1.6     blob_1.1.0     
 [9] tools_3.4.3     bit64_0.9-7     bit_1.1-12      compiler_3.4.3
[13] pkgconfig_2.0.1 memoise_1.1.0   tibble_1.4.2
@james-w-macdonald-5106
United States

It looks like there are three GO IDs in the org.Sc.sgd.db package that have been deprecated by the GO consortium, and you have found one of them. It's not clear how those terms got included; the term you are asking about, for example, was deprecated over a year ago.

> z <- unique(keys(org.Sc.sgd.db, "GOALL"))
> z[!z %in% keys(GO.db)]
[1] "GO:0000059" "GO:0031684" "GO:0003840"
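One possible workaround, until the annotation package is fixed, is to drop annotation rows that point at deprecated GO IDs before computing similarities. The sketch below shows only the filtering step on toy data. The helper name `drop_deprecated`, the gene names, and the gene-to-term pairings are made up for illustration, and the commented-out last line assumes that the GOSemData object stores its gene-to-GO table in a `geneAnno` slot with a `GO` column; check this with `str(yeastGO)` before relying on it.

```r
# Hypothetical helper: keep only annotation rows whose GO ID is still current.
# `anno` is a data frame with a GO ID column; `valid_ids` is a character
# vector of current GO IDs (in a real session this could be keys(GO.db)).
drop_deprecated <- function(anno, valid_ids, go_col = "GO") {
  keep <- anno[[go_col]] %in% valid_ids
  anno[keep, , drop = FALSE]
}

# Toy table standing in for a gene-to-GO annotation (pairings are invented).
anno <- data.frame(
  ENSEMBL = c("geneA", "geneA", "geneB"),
  GO      = c("GO:0006383", "GO:0000059", "GO:0030437"),
  stringsAsFactors = FALSE
)

# Pretend GO:0000059 is the only deprecated ID in this toy example.
valid_ids <- c("GO:0006383", "GO:0030437")

clean <- drop_deprecated(anno, valid_ids)
print(clean)

# Assumed usage in the session from the question (not verified here):
# yeastGO@geneAnno <- drop_deprecated(yeastGO@geneAnno, keys(GO.db))
```

Whether filtering the annotation table alone is enough to avoid the error depends on where GOSemSim looks up the term, so treat this as a starting point and not as a confirmed fix.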
Guangchuang Yu ★ 1.2k
@guangchuang-yu-5419
China/Guangzhou/Southern Medical Univer…

> packageVersion("GOSemSim")
[1] ‘2.5.1’
> library("GOSemSim")
> yeastGO <- godata('org.Sc.sgd.db', keytype = "ENSEMBL", ont="BP", computeIC = TRUE)
preparing gene to GO mapping data...
preparing IC data...
> goSim("GO:0000059", "GO:0030437", yeastGO, 'Resnik')
[1] NA
>

I have already updated GOSemSim to return NA if an input ID is deprecated.


Hi Guangchuang, it is great that the original author is chiming in as well. However, I do not know how to update to 2.5.1 (I am on Bioconductor version 3.6, R version 3.4.3).

I have another question about exactly how you compute Resnik similarity. As I understand it, Resnik's original papers (1996, 1999) are concerned only with IS-A taxonomies. The Gene Ontology contains many relationship types, including "is a", "part of", "has part" and more. Are the Resnik similarity values I get from GOSemSim computed with respect to IS-A relationships only, or are all relationships treated equally, i.e. as if everything were an IS-A relationship? I could not find a statement on this in the GOSemSim paper or vignette. Could I kindly ask you to comment on how the different relationship types are treated?

 


You can use devtools::install_github("guangchuangyu/GOSemSim") to install 2.5.1.

In GOSemSim, all the relationships are treated equally for IC-based methods.
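To make "treated equally" concrete, here is a minimal base-R sketch of Resnik similarity on a toy ontology in which ancestors are collected along every edge, whether it is labelled `is_a` or `part_of`. This is not GOSemSim's implementation. The term names, edge labels, and annotation counts are invented, and the helpers `ancestors` and `resnik` are hypothetical.

```r
# Toy ontology: each term lists its direct parents, tagged by relationship.
# Term names and edge labels are invented for illustration.
parents <- list(
  root = character(0),
  A    = c(is_a = "root"),
  B    = c(is_a = "A"),
  C    = c(part_of = "A"),
  D    = c(is_a = "root")
)

# All ancestors of a term (including the term itself). The relationship
# labels are ignored, so is_a and part_of edges are followed alike.
ancestors <- function(term) {
  seen <- character(0)
  todo <- term
  while (length(todo) > 0) {
    cur <- todo[1]
    todo <- todo[-1]
    if (!cur %in% seen) {
      seen <- c(seen, cur)
      todo <- c(todo, unname(parents[[cur]]))
    }
  }
  seen
}

# Invented counts of genes annotated to each term or any of its descendants.
counts <- c(root = 100, A = 20, B = 5, C = 4, D = 10)

# Information content: negative log of the annotation frequency.
ic <- -log(counts / counts[["root"]])

# Resnik similarity: IC of the most informative common ancestor.
resnik <- function(t1, t2) {
  common <- intersect(ancestors(t1), ancestors(t2))
  max(ic[common])
}

sim_bc <- resnik("B", "C")  # shared ancestors are A and root, so IC(A) = -log(0.2)
sim_bd <- resnik("B", "D")  # only root is shared, so the similarity is 0
print(c(sim_bc, sim_bd))
```

If only `is_a` edges were followed, `C` would have no path to `A` in this toy graph and `resnik("B", "C")` would not be able to use `A` as a common ancestor. Following all edges is what lets `A` count here.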
