GSVA lots of positive values, is this expected behavior?
@owenwhitley-15693
Last seen 2.4 years ago

Hi,

I'm re-analyzing data from Yuan et al. 2018 (https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-018-0567-9), which covers 8 high-grade glioma samples. For one particular sample, I log-normalize the data using the method of Lun et al. (2016) and run GSVA on a subset of cells (putative cancer cells) with several gene signatures. For one gene signature, I see what appear to be consistently positive values. Here's the plot:

[image]

As you can see, for the gene set 'RNA.GSC.c2' (which comprises about 1200 of the 4800 genes used in this analysis), very few samples score below 0. Since GSVA's rank-based score works on genes ranked by relative expression in a dataset, I was a bit surprised by this result. Do you think this could be due to outliers with extremely low log counts?

Here are the sample means for the gene set: [image]

Thanks

gsva • 336 views
Robert Castelo ★ 2.7k
@rcastelo
Last seen 10 weeks ago
Barcelona/Universitat Pompeu Fabra

Hi,

I would definitely remove lowly-expressed genes prior to running GSVA, just as you would do with differential expression. The fact that a gene set has consistently positive scores across samples means to me that its constituent genes are highly ranked in expression values across samples.
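To illustrate the kind of filter meant here, below is a minimal sketch in plain Python rather than R. It is not part of the GSVA package: the function name `filter_low_expression`, the thresholds `min_frac` and `min_value`, and the toy values are all assumptions made up for this example. It keeps a gene only if it is detected in at least a given fraction of cells, which is one common way to define "lowly expressed".

```python
def filter_low_expression(expr, min_frac=0.1, min_value=0.0):
    """Drop lowly-expressed genes from a gene -> per-cell values mapping.

    expr      : dict mapping gene name to a list of log-normalized values,
                one per cell (hypothetical input layout)
    min_frac  : minimum fraction of cells in which the gene must be detected
    min_value : a gene counts as detected in a cell if its value exceeds this
    """
    kept = {}
    for gene, values in expr.items():
        n_detected = sum(1 for v in values if v > min_value)
        if n_detected / len(values) >= min_frac:
            kept[gene] = values
    return kept


# Toy data (invented for illustration): 3 genes measured in 4 cells.
expr = {
    "geneA": [2.1, 1.8, 2.5, 2.0],  # detected in 4/4 cells
    "geneB": [0.0, 0.0, 0.0, 0.7],  # detected in 1/4 cells
    "geneC": [0.0, 1.2, 0.0, 1.1],  # detected in 2/4 cells
}

filtered = filter_low_expression(expr, min_frac=0.5)
kept_genes = sorted(filtered)
print(kept_genes)
```

After filtering, the gene sets would also need to be restricted to the genes that remain, so that the fraction of the ranked list each set occupies reflects only genes that were actually scored. The right threshold depends on the dataset, and the 0.5 used above is only for the toy example.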

cheers,

robert.
