running apeglm on a normal model and multiple coefficients
@frederik-ziebell-14676

Hi Mike and others,

I want to use apeglm for effect-size shrinkage in a (normal) linear regression setting, and I'm interested in shrinking almost all of the model coefficients. Here's a toy example:

library("apeglm")
library("DEP")
library("limma")
library("SummarizedExperiment")
library("dplyr")

# load data
data <- UbiLength %>% 
  filter(Reverse != "+", Potential.contaminant != "+") %>% 
  make_unique("Gene.names", "Protein.IDs", delim = ";")
LFQ_columns <- grep("LFQ.", colnames(data))
se <- make_se(data, LFQ_columns, UbiLength_ExpDesign)

# data matrix (rows without missing values) and model matrix
Y <- assay(se)[!rowAnyMissings(assay(se)), ]
X <- model.matrix(~condition, colData(se))

# model fitting
fit <- lmFit(Y, X)
coef <- fit$coefficients
coef_error <- fit$stdev.unscaled * fit$sigma  # standard errors of the coefficients

# normal log-likelihood (apeglm expects the arguments y, x, beta, param, offset)
log_lik_norm <- function(y, x, beta, param, offset) {
  dnorm(y, mean = x %*% beta, sd = param, log = TRUE)
}

# apeglm
coef_nr <- 2
mle <- log(2) * cbind(coef[, coef_nr], coef_error[, coef_nr])
res <- apeglm(Y = Y, x = X, log.lik = log_lik_norm, param = rowSds(Y),
              mle = mle, coef = coef_nr, threshold = 1.2)

My questions are:

  1. Is my setup for the normal log-likelihood correct?
  2. Is it possible to shrink multiple coefficients simultaneously and obtain the corresponding s-values and fsr? In my application, Y has dimension 1500 x 50000 and X has 200 columns, which is why I don't want to run the model for each coefficient separately.

I also had a look at ashr, but testing against a threshold is not implemented there.

Best, Frederik

@mikelove

Two issues I see.

  1. You don't need the log(2) -- that's for the GLM case.
  2. The marginal sample standard deviation of y (rowSds(Y)) is not the residual standard error; param should be the residual standard error here. Both fixes are applied in the sketch below.
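
For example, a minimal sketch of the corrected call, reusing the objects from your example (fit$sigma from lmFit is the residual standard error):

# corrected call: drop the log(2) scaling of the MLE and use the
# residual standard error from lmFit (fit$sigma) instead of rowSds(Y)
mle <- cbind(coef[, coef_nr], coef_error[, coef_nr])
res <- apeglm(Y = Y, x = X, log.lik = log_lik_norm, param = fit$sigma,
              mle = mle, coef = coef_nr, threshold = 1.2)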

As for question 2: not really -- you can only pull out the FSR, s-values, and credible intervals for one coefficient at a time. See ?apeglm and the arguments coef, mle, and prior.control. You can in theory shrink all coefficients simultaneously if you prespecify the scale of the Cauchy prior, but the function is only set up to estimate the prior from the mle for one coef at a time.

With 200 columns, what you could do is scale them, figure out a good scale for the prior by running apeglm on a subset of coefficients, and then use that fixed scale for all 200 (see the sketch below). This is closer to what bayesglm does (it uses a common prior across all non-intercept coefficients).
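
A rough sketch of that idea: prior_scale below is a hypothetical value that you would settle on from the subset run, and the other prior.control entries are the defaults listed in ?apeglm.

# assumed: X has scaled (standardized) non-intercept columns, and the
# prior scale was estimated once from a subset of coefficients
prior_scale <- 0.5  # hypothetical value taken from the subset run
res_list <- lapply(seq(2, ncol(X)), function(k) {
  mle_k <- cbind(coef[, k], coef_error[, k])
  apeglm(Y = Y, x = X, log.lik = log_lik_norm, param = fit$sigma,
         mle = mle_k, coef = k, threshold = 1.2,
         prior.control = list(no.shrink = 1, prior.mean = 0,
                              prior.scale = prior_scale, prior.df = 1,
                              prior.no.shrink.mean = 0,
                              prior.no.shrink.scale = 15))
})

With prior.scale fixed, each call skips the prior-estimation step; the s-values and FSR still come out one coefficient per call.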

Thank you, I'll try that out!
