Assess fit of variance prior
2
1
Entering edit mode
@frederik-ziebell-14676
Last seen 27 days ago
Heidelberg, Germany

Since limma is also applied to data types other than microarrays, e.g. MS-based proteomics data, is it advisable to asses the fit of the variance prior before running limma? I found that even for data that follows the limma model, it is difficult to judge the fit of the prior based on a QQ-plot. In the example below, the fit does not seem to be very good although the estimates of df.prior and s2.prior are very close to the true value.

library("limma")

plot_prior_fit <- function(fit){

  q_chisq <- function(p) {qchisq(p, df = fit$df.prior)}

  y <- fit$df.prior*fit$s2.prior/fit$sigma^2

  qqplot(
    x = q_chisq(ppoints(y)),
    y = y,
    xlab = "Theoretical Quantiles", ylab = "Sample Quantiles"
  )
  qqline(y, distribution = q_chisq, col = "red")  
}

n_genes <- 5000
n_samples_per_group <- 4

set.seed(1)
sigma2 <- 0.05 / rchisq(n_genes, df = 10) * 10
y <- matrix(rnorm(n_genes*2*n_samples_per_group, sd = sqrt(sigma2)), nrow = n_genes, ncol = 2*n_samples_per_group)
design <- cbind(Intercept = 1, Group = c(rep(0, n_samples_per_group), rep(1, n_samples_per_group)))
# 1% of genes differentially expressed
y[seq_len(floor(n_genes/100)), seq_len(n_samples_per_group)] <- y[seq_len(floor(n_genes/100)), seq_len(n_samples_per_group)] + 1
fit <- lmFit(y,design)
fit <- eBayes(fit)

plot_prior_fit(fit)

enter image description here

limma • 455 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 6 hours ago
WEHI, Melbourne, Australia

is it advisable to asses the fit of the variance prior before running limma?

No, that isn't necessary. Bayesian analyses are extremely robust to the shape of the prior distribution. In any case, assessing the fit of the prior distribution directly is impossible because it relates to quantities that are not observed.

If this sort of checking was necessary or useful, then it would explained in the limma User's Guide or workflows.

In the example below, the fit does not seem to be very good

You have not compared like with like. limma assumes the observed variances to follow a scaled F-distribution, as explained in the published papers.

ADD COMMENT

Login before adding your answer.

Traffic: 475 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6