Question: Are singscore scores comparable between gene sets?
10 weeks ago, Pietro wrote:

Hello everyone

Regarding the singscore package, and in particular the function simpleScore, which scores a gene expression dataset against one or two gene sets: I was wondering whether the signature scores obtained from different gene sets are comparable.

Say I have 4 gene sets that I use to classify tumor samples into molecular subtypes. My idea is to score the gene expression dataset with each of the 4 gene sets separately, and then compare the signature scores across the 4 gene sets for each sample.

I would like to know if the scores are comparable in absolute terms.

Here is an example output (note that I ran each gene set separately and then merged the results):

id         Gene_set_1  Gene_set_2  Gene_set_3  Gene_set_4
sample_1   -0.0625     -0.194      0.298       0.182
sample_2   -0.0706     -0.211      0.273       0.218
sample_3   0.0366      -0.204      0.183       0.263
sample_4   -0.0219     -0.221      0.325       0.215
sample_5   -0.0215     -0.232      0.267       0.2
sample_6   -0.00629    -0.186      0.205       0.255
sample_7   -0.0425     -0.202      0.177       0.217
sample_8   -0.0985     -0.219      0.252       0.191
sample_9   -0.0726     -0.194      0.272       0.154
sample_10  -0.0513     -0.226      0.245       0.161


Can I say, for example, that for sample_1 the gene set scores rank as Gene_set_3 > Gene_set_4 > Gene_set_1 > Gene_set_2?
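
To make the question concrete, here is a minimal Python sketch of the per-sample ranking being asked about. It uses only the first three rows of the example table; the names `scores` and `rank_gene_sets` are illustrative and are not part of singscore. It shows the mechanics only: whether such a ranking is meaningful depends on the scores being comparable across gene sets, which is exactly the open question.

```python
# Scores copied from the first three rows of the example table.
scores = {
    "sample_1": {"Gene_set_1": -0.0625, "Gene_set_2": -0.194,
                 "Gene_set_3": 0.298, "Gene_set_4": 0.182},
    "sample_2": {"Gene_set_1": -0.0706, "Gene_set_2": -0.211,
                 "Gene_set_3": 0.273, "Gene_set_4": 0.218},
    "sample_3": {"Gene_set_1": 0.0366, "Gene_set_2": -0.204,
                 "Gene_set_3": 0.183, "Gene_set_4": 0.263},
}


def rank_gene_sets(sample_scores):
    """Return gene set names ordered from highest to lowest score."""
    return sorted(sample_scores, key=sample_scores.get, reverse=True)


# Rank the gene sets within each sample (a naive comparison that
# assumes the scores share a common scale across gene sets).
ranking = {sample: rank_gene_sets(s) for sample, s in scores.items()}

for sample, order in ranking.items():
    print(sample, " > ".join(order))
```

Under this naive comparison, sample_1 and sample_2 would be assigned to Gene_set_3 and sample_3 to Gene_set_4.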

Thanks

PS: cross-posted to Biostars

Answer: Are singscore scores comparable between gene sets?
10 weeks ago, Robert Castelo (Spain/Barcelona/Universitat Pompeu Fabra) wrote:

Hi,

I can't comment on singscore, but the default "gsva" method implemented in the GSVA package tries to make scores comparable across gene sets by first bringing the gene expression profiles to a common scale, before summarizing expression at the gene set level. The full method is described in the GSVA paper. However, as explained in the paper's discussion, for this scaling step to work well you should have at least 10 samples in your data set.
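
As a rough illustration of what "bringing expression profiles to a common scale" can mean, here is a small Python sketch that replaces each gene's values with its empirical cumulative distribution across samples. This is a simplification and not the GSVA implementation: GSVA uses a kernel estimate of the cumulative distribution and then applies further steps. The function `ecdf_scale` and the toy data are hypothetical.

```python
def ecdf_scale(expr):
    """Map each gene's values to (0, 1] via its empirical CDF across samples.

    expr: dict mapping gene name -> list of expression values, one per sample.
    Each output value is the fraction of samples whose expression of that
    gene is less than or equal to the value in the given sample.
    """
    scaled = {}
    for gene, values in expr.items():
        n = len(values)
        scaled[gene] = [sum(v <= x for v in values) / n for x in values]
    return scaled


# Hypothetical toy data: two genes measured on very different scales
# across four samples.
expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [100.0, 400.0, 200.0, 300.0],
}

scaled = ecdf_scale(expr)
print(scaled["gene_a"])
print(scaled["gene_b"])
```

After this step both genes take values on the same 0 to 1 scale regardless of their original magnitude. Because the distribution is estimated across samples, the estimate becomes coarse when only a few samples are available.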

cheers,

robert.

Hi Robert

Thanks for the answer. I know the GSVA package very well; I use it every day for the same purpose. My goal was to do the same with the singscore package and compare the results, to see how the latter performs.

Thanks anyway

Pietro