Hi,
How is it possible that the similarity matrix returned by mgeneSim() (gene semantic similarity measurement, R package GOSemSim) has no 1's on the diagonal? The diagonal is 1 only with measure="Wang"; with the other measures, such as "Resnik" or "Rel", the diagonal values differ from 1.
library(GOSemSim)
hsGO <- godata('org.Hs.eg.db', ont="MF")
mgeneSim(genes=c("835", "5261", "241", "994"), semData=hsGO, measure="Rel", verbose=FALSE)
##         835  5261   241   994
## 835   0.947 0.400 0.255 0.488
## 5261  0.400 0.908 0.325 0.462
## 241   0.255 0.325 0.941 0.273
## 994   0.488 0.462 0.273 0.907
The example is taken from
https://yulab-smu.top/biomedical-knowledge-mining-book/GOSemSim.html
So how is the similarity in GOSemSim calculated?
The results are completely different from those returned by getGeneSim() in the GOSim package for the same similarity measures (there, the diagonal of the similarity matrix is 1, as you would expect). I was checking whether the two packages give the same results, and apparently they do not.
This is a known problem, first reported in 2019: https://github.com/YuLab-SMU/GOSemSim/issues/23. You could port the fix from that package to GOSemSim and hope it is included in the next release.
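As a rough illustration of why an information-content measure can give a self-similarity below 1, here is a minimal sketch based on the textbook definitions of the Lin, Rel and Resnik measures, not on GOSemSim's actual code. The term names and annotation probabilities are made up for the example, and how GOSemSim normalizes or combines term scores into a gene score may differ.

```r
# Hypothetical annotation probabilities p(t) for three GO terms
# (illustrative values only, not taken from org.Hs.eg.db).
p <- c(termA = 0.05, termB = 0.10, termC = 0.20)

# Information content of each term: IC(t) = -log p(t)
ic <- -log(p)

# Comparing a term with itself, the most informative common ancestor
# is the term itself, so the textbook formulas reduce to:
lin_self    <- 2 * ic / (ic + ic)   # Lin: always exactly 1
rel_self    <- lin_self * (1 - p)   # Rel: Lin weighted by (1 - p(t)), so below 1
resnik_self <- ic                   # Resnik (unnormalized): IC(t), not 1 in general

print(rbind(Lin = lin_self, Rel = rel_self, Resnik = resnik_self))
```

Under these definitions, a gene-level score that averages best-matching term scores cannot reach 1 for a gene compared with itself when every term-level self-similarity is already below 1, which is the pattern on the diagonal of the "Rel" matrix above.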