running apeglm on a normal model and multiple coefficients
@frederik-ziebell-14676

Hi Mike and others,

I want to use apeglm for effect-size shrinkage in a (normal) linear regression setting, and I'm interested in shrinking almost all of the model coefficients. Here's a toy example:

library("apeglm")
library("DEP")
library("limma")
library("SummarizedExperiment")
library("dplyr")

# load data
data <- UbiLength %>% 
  filter(Reverse != "+", Potential.contaminant != "+") %>% 
  make_unique("Gene.names", "Protein.IDs", delim = ";")
LFQ_columns <- grep("LFQ.", colnames(data))
se <- make_se(data, LFQ_columns, UbiLength_ExpDesign)

# data and model matrix: keep only proteins without missing values
Y <- assay(se)[!rowAnyMissings(assay(se)), ]
X <- model.matrix(~condition, colData(se))

# model fitting
fit <- lmFit(Y, X)
coef <- fit$coefficients
coef_error <- fit$stdev.unscaled * fit$sigma # per-coefficient standard errors

# normal log-likelihood, with the argument signature apeglm expects for log.lik
log_lik_norm <- function(y, x, beta, param, offset) {
  dnorm(y, mean = x %*% beta, sd = param, log = TRUE)
}

# apeglm
coef_nr <- 2
mle <- log(2) * cbind(coef[, coef_nr], coef_error[, coef_nr])
res <- apeglm(Y = Y, x = X, log.lik = log_lik_norm, param = rowSds(Y),
              mle = mle, coef = coef_nr, threshold = 1.2)

My questions are:

  1. Is my setup for the normal log-likelihood correct?
  2. Is it possible to shrink multiple coefficients simultaneously and obtain the corresponding s-values and FSR? In my application, Y has dimension 1500 x 50000 and X has 200 columns, which is why I don't want to run the model separately for each coefficient.

I also had a look at ashr, but testing against a threshold is not implemented there.

Best, Frederik

Tags: shrinkage, apeglm
@mikelove

Two issues I see:

  1. You don't need the log(2) -- that's for the GLM case (it converts log2-scale coefficients to the natural-log scale).
  2. The marginal sample standard deviation of y (rowSds(Y)) is not the residual standard error. param should be the residual standard error here, i.e. fit$sigma.
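
Putting both fixes together, your call would look something like this (a sketch continuing the code from your post):

# corrected setup: no log(2) rescaling, and param = residual standard error
mle <- cbind(coef[, coef_nr], coef_error[, coef_nr])
res <- apeglm(Y = Y, x = X, log.lik = log_lik_norm, param = fit$sigma,
              mle = mle, coef = coef_nr, threshold = 1.2)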

Regarding (2): not really, you can only pull out the FSR, s-values, and credible intervals for one coefficient at a time. See ?apeglm and the arguments coef, mle, and prior.control. You can in theory shrink simultaneously if you prespecify the scale of the Cauchy prior, but the function is only set up to estimate the prior from mle for one coef at a time.
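
For a single coefficient, those quantities are read off the returned list like so (a sketch; the element names are as documented in ?apeglm):

# results for the one shrunken coefficient (element names per ?apeglm)
shrunken <- res$map[, coef_nr] # posterior-mode (MAP) estimates
svals <- res$svalue            # s-values, here with respect to threshold = 1.2
fsr <- res$fsr                 # false sign rates
ci <- res$interval             # credible intervals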

With 200 columns, what you could do is scale them, then figure out a good scale for the prior by running apeglm on a subset, and then use that scale for all 200. This is closer to what bayesglm does (it uses a common prior across all non-intercept coefficients).
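That workflow might look roughly like this (an untested sketch: the subset size, the loop over coefficients, and reading the fitted prior back out of res$prior.control are my assumptions):

# put non-intercept columns on a common scale and refit
Xs <- X
Xs[, -1] <- scale(Xs[, -1])
fit_s <- lmFit(Y, Xs)
se_s <- fit_s$stdev.unscaled * fit_s$sigma

# estimate the prior from one coefficient on a subset of rows
sub <- sample(nrow(Y), 500) # subset size is a placeholder
mle_sub <- cbind(fit_s$coefficients[sub, 2], se_s[sub, 2])
res_sub <- apeglm(Y = Y[sub, ], x = Xs, log.lik = log_lik_norm,
                  param = fit_s$sigma[sub], mle = mle_sub, coef = 2)
prior <- res_sub$prior.control # the prior apeglm settled on

# reuse that fixed prior (no mle needed) for all non-intercept coefficients
res_all <- lapply(seq_len(ncol(Xs))[-1], function(j)
  apeglm(Y = Y, x = Xs, log.lik = log_lik_norm, param = fit_s$sigma,
         coef = j, threshold = 1.2, prior.control = prior))
svalues <- sapply(res_all, `[[`, "svalue")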

Thank you, I'll try that out!
