Can sample-level weights be used when analysing GSVA scores in limma?
A.Barden ▴ 20
@abarden-23487
USA

Hello,

I've carried out a gene-level differential expression analysis in limma with a human-derived clinical RNA-Seq dataset, utilising voomWithQualityWeights in order to account for variation in sample quality and heteroscedasticity between groups.

I'd now like to run the same model in limma with GSVA scores derived from the MSigDB C2 collection of gene sets and some other previously defined gene sets that are of particular interest in our study.

Exploratory analysis of the GSVA scores (PCA and MDS) suggests that the sample quality variation/group heteroscedasticity that we observed at the gene-level is also there at the gene set-level, albeit to a lesser extent.

Would it therefore be problematic to re-use the sample weights calculated for the gene-level analysis in the gene set analysis (i.e. the weights stored in v$targets$sample.weights) or to use arrayWeights to generate sample-level weights from the GSVA scores?

I have in fact already tried both approaches and found that, particularly when re-using the gene-level weights, the results of the gene set analysis are more similar to those of the gene-level analysis than when no weights are used.

By similarity I mean that the numbers of significant hits for each contrast are more aligned between the gene-level analysis and the gene set analysis when weights are used, i.e. contrasts with fewer significant DEGs have fewer DE gene sets and contrasts with more DEGs have more DE gene sets. When no weights are used in the gene set analysis, there is less alignment with the gene-level analysis.

Thanks very much again for your help

@gordon-smyth
WEHI, Melbourne, Australia

Yes, you could probably use the voom sample weights for an analysis of the GSVA scores for the same data, although I have never tried it.
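As a rough illustration of what re-using the voom sample weights could look like, here is a minimal sketch on simulated data. The objects `counts`, `gsva_scores`, `group` and `design` are hypothetical stand-ins for the real data, and the random matrix only takes the place of actual GSVA output.

```r
library(limma)

set.seed(1)

# Simulated stand-ins (assumptions, not the poster's data):
# 200 genes x 8 samples of counts, two groups of four samples
counts <- matrix(rnbinom(200 * 8, mu = 100, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:8)))
group  <- factor(rep(c("A", "B"), each = 4))
design <- model.matrix(~ group)

# Gene-level step described in the question
v  <- voomWithQualityWeights(counts, design)
sw <- v$targets$sample.weights   # one weight per sample

# Placeholder for real GSVA scores: 20 gene sets x the same 8 samples
gsva_scores <- matrix(rnorm(20 * 8, sd = 0.3), nrow = 20,
                      dimnames = list(paste0("set", 1:20), colnames(counts)))

# A weights vector of length ncol(gsva_scores) is treated as sample weights
fit <- lmFit(gsva_scores, design, weights = sw)
fit <- eBayes(fit)
topTable(fit, coef = 2)
```

The columns of the score matrix must be in the same sample order as the columns of `v`, otherwise the weights will be attached to the wrong samples.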

I would prefer, though, to use limma-trend for the GSVA scores, with the square root of the number of genes in each set as the trend predictor (eBayes with trend=sqrt(ngenes)), and to re-estimate the sample weights from the GSVA analysis; see "The difference when using Limma compare between 3 conditions vs 2 conditions".
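A minimal sketch of that alternative, assuming simulated scores in place of real GSVA output. The names `gsva_scores`, `ngenes`, `group` and `design` are hypothetical; in a real analysis `ngenes` would hold the size of each gene set, in the same row order as the score matrix.

```r
library(limma)

set.seed(2)

n_sets    <- 50
n_samples <- 8

# Placeholder for real GSVA scores (gene sets in rows, samples in columns)
gsva_scores <- matrix(rnorm(n_sets * n_samples, sd = 0.3), nrow = n_sets,
                      dimnames = list(paste0("set", seq_len(n_sets)),
                                      paste0("s", seq_len(n_samples))))

# Hypothetical gene-set sizes, one per row of gsva_scores
ngenes <- sample(10:500, n_sets, replace = TRUE)

group  <- factor(rep(c("A", "B"), each = 4))
design <- model.matrix(~ group)

# Re-estimate sample weights from the GSVA scores themselves
aw <- arrayWeights(gsva_scores, design)

# Weighted linear model, then limma-trend with sqrt(set size) as the
# covariate for the prior variance
fit <- lmFit(gsva_scores, design, weights = aw)
fit <- eBayes(fit, trend = sqrt(ngenes))
topTable(fit, coef = 2, number = 5)
```

A numeric `trend` must have one value per row of the fitted object, so `ngenes` should be computed after any gene-set size filtering applied when the scores were calculated.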

Hi Gordon, thanks for your advice on how to use sample weights with GSVA scores. I have not used sample weights with GSVA scores so far either, but my intuition is also that, since sample weights are data-driven, it is probably better to re-calculate them from the GSVA scores using arrayWeights(). I also notice now that in this section of the GSVA vignette I was giving the wrong advice on how to set the trend parameter in the limma-trend pipeline (using ngenes instead of sqrt(ngenes)). I'll correct that today.

Thanks both for this useful information.
