Interpretation of GSVA with absRanking=TRUE
1
0
Entering edit mode
@88108a1e
Last seen 2 days ago
Australia

Hi,

I'm having trouble interpreting the intended effect of the absRanking argument on GSVA. I'm running an example with three gene sets, one is strongly positive, one is strongly negative, and one without differential expression.

Standard GSVA (absRanking=FALSE) returns the expected results (up/down/none), but GSVA with absRanking=TRUE does not show any significant enrichment. My ultimate aim is to do ranking with mixed-sign genesets (up/down) but would've naively assumed that all-up / all-down are a special case of that and thus should work too.

Thanks for your help,


set.seed(19820)

library(GSVA)
library(limma)

p <- 1000 ## number of genes
n <- 100 ## number of samples
nGrp1 <- 50 ## number of samples in group 1
nGrp2 <- n - nGrp1 ## number of samples in group 2

# consider three disjoint gene sets
geneSets <- list(set1 = paste("g", 1:3, sep = ""),
set2 = paste("g", 4:6, sep = ""),
set3 = paste("g", 7:10, sep = ""))

# sample data from a normal distribution with mean 0 and st.dev. 1
x1 <- matrix(rnorm(n*p), nrow = p, ncol = n,
dimnames = list(paste("g", 1:p, sep = ""),
paste("s", 1:n, sep = "")))

# genes in set1 are expressed at higher levels in the last 'nGrp1+1' to 'n' samples
x1[geneSets$set1, (nGrp1+1):n] <- x1[geneSets$set1, (nGrp1+1):n] + 2

# Also set2 are lower
x1[geneSets$set2, (nGrp1+1):n] <- x1[geneSets$set2, (nGrp1+1):n] - 2

## build design matrix
design <- cbind(sampleGroup1 = 1, sampleGroup2vs1 = c(rep(0, nGrp1), rep(1, nGrp2)))

gsvapar1 <- gsvaParam(x1, geneSets, maxDiff = TRUE, absRanking = FALSE)
gsvapar2 <- gsvaParam(x1, geneSets, maxDiff = TRUE, absRanking = TRUE)

gsva_es1 <- gsva(gsvapar1)
gsva_es2 <- gsva(gsvapar2)

topTable(eBayes(lmFit(gsva_es1, design)), coef = "sampleGroup2vs1")
logFC      AveExpr          t      P.Value    adj.P.Val          B
set2 -1.06001420 -0.007121808 -27.712090 8.321766e-50 2.496530e-49 103.047997
set1  1.09278445 -0.003754398  26.819689 1.651951e-48 2.477926e-48 100.049550
set3 -0.07639161  0.008858442  -1.132907 2.598605e-01 2.598605e-01  -7.663724

topTable(eBayes(lmFit(gsva_es2, design)), coef = "sampleGroup2vs1")
logFC   AveExpr          t   P.Value adj.P.Val         B
set1 -0.02693682 0.7847923 -1.2968183 0.1957106 0.4985793 -5.381740
set2  0.02016743 0.7983954  0.9709200 0.3323862 0.4985793 -5.735982
set3 -0.01100966 0.6397191 -0.5300375 0.5964860 0.5964860 -6.053170

sessionInfo( )
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: ---

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale:
[1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C               LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8     LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8    LC_PAPER=en_AU.UTF-8
[8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] limma_3.50.3 GSVA_1.50.1

GSVA • 161 views
1
Entering edit mode
Robert Castelo ★ 3.3k
@rcastelo
Last seen 3 days ago
Barcelona/Universitat Pompeu Fabra

Hi,

Thanks for bringing this up, the documentation is confusing about what to do when you have non-directional gene sets, what you call mixed-sign gene sets. I have just fixed that bit of the documentation in the devel and release (1.52.3) versions of GSVA, which will become available for update via BiocManager::install() in the next 24/48 hours. The GSVA paper, however, gives the right hint in the paragraph after Equation (5). Essentially, you should use maxDiff=FALSE and forget about the absRanking parameter.

robert.

0
Entering edit mode

Thanks Robert, appreciate the quick reply (and fix).