Question

Block weights in scran::pairwiseTTests

Angelos Armen • 0

@angelos-armen-21507

United Kingdom

Hi Aaron,

In the source code of scran::pairwiseTTests (current version on GitHub), there is this comment:

In theory, a better weighting scheme would be to use the estimated standard error of the log-fold change to compute the weight. This would be more responsive to differences in variance between blocking levels, focusing on levels with low variance and high power. However, this is not safe in practice as genes with many zeroes can have very low standard errors, dominating the results inappropriately.

Are you referring to a situation like the example below? Here I have a single gene, two blocks, and two groups. The sample size is equal across blocks and groups. The fold change between groups is the same (two-fold) in both blocks, but the first block has far more zeroes. Weighting by the reciprocal of the squared standard error would then let the first block dominate the result.

library(metapod)

# equal sample size across groups and blocks
n <- 100

# create counts with n_1 ones and n - n_1 zeroes
create_counts <- function(n_1) c(rep(1, n_1), rep(0, n - n_1))

# create block 1 counts
x_1 <- create_counts(1)
y_1 <- create_counts(2)

# create block 2 counts
x_2 <- create_counts(10)
y_2 <- create_counts(20)

# run t-tests per block
out_1 <- t.test(x_1, y_1)
out_2 <- t.test(x_2, y_2)

# get p-values and weights
p_values <- list(out_1$p.value, out_2$p.value)
p_values
weights <- c(1/out_1$stderr^2, 1/out_2$stderr^2)
weights

[[1]]
[1] 0.5631136

[[2]]
[1] 0.04807781

[1] 3355.932  396.000

# combine p-values, weighting each block by 1/stderr^2
combineParallelPValues(p_values, method = "stouffer", weights = weights)$p.value

# combine p-values without weights (as sample sizes are equal)
combineParallelPValues(p_values, method = "stouffer")$p.value

[1] 0.4851627
[1] 0.1436335
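
To make the dominance of block 1 explicit, the combination can be reproduced by hand in base R. This is only a sketch: it assumes the usual weighted Stouffer formula, Z = sum(w * z) / sqrt(sum(w^2)), with each p-value converted to an upper-tail z-score, which may differ in detail from what metapod does internally. `stouffer` is a hypothetical helper written for illustration, and the p-values and weights are copied from the printed output above.

```r
# p-values and precision weights copied from the output above
p <- c(0.5631136, 0.04807781)
w <- c(3355.932, 396.000)

# hypothetical helper (not part of metapod): weighted Stouffer's Z-score method,
# assuming Z = sum(w * z) / sqrt(sum(w^2)) with upper-tail z-scores
stouffer <- function(p, w = rep(1, length(p))) {
  z <- qnorm(p, lower.tail = FALSE)      # per-block z-scores
  z_comb <- sum(w * z) / sqrt(sum(w^2))  # weighted combination
  pnorm(z_comb, lower.tail = FALSE)      # back to a combined p-value
}

# share of the total weight carried by each block
w / sum(w)

# weighted and unweighted combined p-values
stouffer(p, w)
stouffer(p)
```

Under these assumptions block 1 carries close to 90% of the total weight, so its non-significant z-score almost entirely determines the combined statistic.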

score 2 · Accepted Answer · 2021-11-19

Aaron Lun ★ 28k

@alun

The city by the bay

Yes, especially with the large number of zeroes that deflate the variance in low-coverage blocks. This makes things unpredictable and unintuitive - for example, the precision weighting favors low-coverage blocks with few cells where you might be lucky enough to get an all-zero group that drives the pooled variance towards zero.
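
As a rough illustration of the all-zero case, the sketch below compares a hypothetical low-coverage block of 20 cells per group, where one group happens to be entirely zero, against block 2 from the question (100 cells per group). It uses Welch's `t.test` as in the question, which is an assumption and not necessarily how scran computes its variances; `precision_weight` and the `*_small` data are made up for the example.

```r
# hypothetical low-coverage block: 20 cells per group, one group entirely zero
x_small <- rep(0, 20)
y_small <- c(1, rep(0, 19))

# block 2 from the question: 100 cells per group
x_large <- c(rep(1, 10), rep(0, 90))
y_large <- c(rep(1, 20), rep(0, 80))

# hypothetical helper: precision weight = 1 / squared standard error
# (the stderr element of t.test output requires R >= 3.6.0)
precision_weight <- function(x, y) 1 / t.test(x, y)$stderr^2

w_small <- precision_weight(x_small, y_small)
w_large <- precision_weight(x_large, y_large)
c(small = w_small, large = w_large)
```

In this toy setup the 20-cell block receives a slightly larger precision weight than the 100-cell block, despite having a fifth of the cells, because the all-zero group contributes no variance at all.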