Hello, just a quick question today. I am an undergraduate using clusterProfiler for functional analysis of gene sets. The plots made from enrichGO and GSEA results show a variable called "count", but it does not line up with the numbers we are passing in as parameters. Does anyone have an explanation of exactly what "count" is counting? We think it is derived from our variable "set_size", which holds the gene counts of the gene sets, but some transformation seems to be applied to it that I can find no info on. Thanks!
The documentation is a bit sparse. It appears that the Count column is being created by the enricher_internal function on line 148. My guess is that it is calculating the size of each set according to what is provided in USER_DATA, which would be the GO_DATA object created by the enrichGO function on line 43. That is, the sizes of sets before filtering to what is actually present in the dataset. It is just pulling GO sets from an external database. Would have to look inside get_GO_data to learn exactly what is happening, so you might start there.
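
One way to test that guess without reading the source is to work out the candidate quantities by hand for a few terms and see which one matches the reported Count. The sketch below is a toy illustration in Python, not clusterProfiler code: the function name, the term labels, and the gene IDs are all made up, and it assumes only that each gene set is a plain list of gene IDs.

```python
def candidate_counts(gene_sets, input_genes, background=None):
    """For each term, compute three numbers that "Count" could plausibly be.

    gene_sets   : dict mapping a term label to a list of gene IDs (hypothetical data)
    input_genes : the gene list submitted for enrichment
    background  : optional universe of genes; None means no filtering
    """
    query = set(input_genes)
    universe = set(background) if background is not None else None
    rows = {}
    for term, genes in gene_sets.items():
        members = set(genes)
        # Set members that survive filtering to the background, if one is given.
        kept = members & universe if universe is not None else members
        rows[term] = {
            "full_size": len(members),              # size as stored in the database
            "size_in_background": len(kept),        # size after filtering
            "overlap_with_input": len(kept & query) # input genes found in the set
        }
    return rows


# Made-up example data, for illustration only.
gene_sets = {
    "term_A": ["g1", "g2", "g3", "g4"],
    "term_B": ["g3", "g5"],
}
input_genes = ["g1", "g3", "g9"]
background = ["g1", "g2", "g3", "g5", "g9"]

result = candidate_counts(gene_sets, input_genes, background)
for term, row in result.items():
    print(term, row)
```

Comparing these three numbers against the Count column for the same terms should show whether it tracks the unfiltered set size, the filtered set size, or the overlap with the submitted genes.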