GOSemSim - mgeneSim weird output
0
0
Entering edit mode
Giovanni • 0
@18eed1a1
Last seen 9 months ago
Italy

I started using mgeneSim function, and i passed an array of 11000+ EntrezID genes. It gave me a similarity matrix with some columns and rows all containing value "1". After mapping the EntrezID genes to GO, I noticed that some set of Go ID of two different genes have a GO ID in common. This should be the reason why I have these columns and rows set at 1.

How can I have consistent results from mgeneSim? Is it possible not to consider the GO ID two genes share?

A little example, with some of the gene pairs that make exception:

mgeneSim(c("3613", "83541", "5651", "23492", "157310"), semData=hsGO, measure="Wang", combine = "BMA", verbose = TRUE)

The output it returns is the following:output

GOSemSim similarity • 393 views
ADD COMMENT
0
Entering edit mode

You could change how to combine results from BMA to other methods, but the goal of GOSemSim is to provide a measure between two genes based on the Gene Ontology. That means that the GO ID of each genes should be used and if shared they should be dealt with it somehow. There are other measures for semantic similarity of genes based on GO, I am not sure if you picked Wang for a reason or just because it is the default.

Or you could use some other distance metric (not semantic similarity). I created a package for pathways and gene sets that uses the Dice similarity. If you provided the GO terms it could be used to calculate the dice similarity with the GO terms).

ADD REPLY

Login before adding your answer.

Traffic: 618 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6