Dear all,
I performed an EWAS using limma, and applied bacon to assess and correct for bias and inflation. I generated QQ plots using bacon; however, I am unsure whether the uncorrected QQ plot is being displayed correctly.
Specifically, the minimum uncorrected p-value in my results is approximately 9e-136, but in the QQ plot the observed values appear to reach around 1e-200, which seems inconsistent.
>min(pval)
[1] 9.46751e-136 # uncorrected
>min(pval(bc))
[1] 3.595391e-131 # corrected
In addition, in the QQ plots, both the corrected and uncorrected results show observed -log(p) values that are much larger than the expected -log(p) values. Despite this, the estimated inflation factors are relatively modest.
>bc
Bacon-object containing 1 set(s) of 748089 test-statistics.
...estimated bias: 0.23.
...estimated inflation: 1.2. (uncorrected inflation)
>bc_bacon
Bacon-object containing 1 set(s) of 748089 test-statistics.
...estimated bias: 0.0021.
...estimated inflation: 1.0. (corrected inflation)
How should this apparent discrepancy between the QQ plot patterns and the estimated inflation values be interpreted?
# load packages
library(readxl)
library(openxlsx)
library(limma)
library(QCEWAS)
library(bacon)
library(ggplot2)
library(dplyr)
# Load test set
load(file="D:/Study/Estimation/Preprocessed_Data/SNUH_Mvalue.RData")
pheno <- read.csv("D:/Study/Estimation/Biological age & methylation/EWAS/Pheno.SNUH.csv")
# leave phenotype data only target variable and covariates
pheno <- pheno[,c(1,2,4,14,16:22)]
pheno <- na.omit(pheno)
# preprocess data
myBeta <- SNUH_M[,pheno$BG_ID]
pheno$Sentrix_ID <- factor(pheno$Sentrix_ID)
smoking_labels <- factor(pheno$smoking_3,levels=c("1","2","3"))
# EWAS using limma
design <- model.matrix(~age_sample+LOG10_BMI+smoking_labels+CD8T+CD4T+NK+Bcell+Mono+Neu+Sentrix_ID, pheno)
fit_raw <- lmFit(myBeta, design)
gc()
## 1. bacon_QC
coef <- fit_raw$coefficients[,2]
se <- fit_raw$stdev.unscaled[, 2] * fit_raw$sigma
tstat <- coef / se
pval <- 2 * pt(-abs(tstat), fit_raw$df.residual)
n <- ncol(design) + fit_raw$df.residual
bc <- bacon(teststatistics = tstat,
effectsizes = coef,
standarderrors = se,
verbose = TRUE)
bc
#plot
histo <- plot(bc, type="hist")
ggsave("D:/Study/Estimation/Biological age & methylation/EWAS/Age/260113/QC_data/age_bacon_histogram.png", dpi=800)
qq <- plot(bc, type="qq")
ggsave("D:/Study/Estimation/Biological age & methylation/EWAS/Age/260113/QC_data/age_bacon_qq.png", dpi=800)
## 2. bacon_correction
pval_bacon <- bacon::pval(bc)
tstat_bacon <- bacon::tstat(bc)
bc_bacon <- bacon::bacon(teststatistics = tstat_bacon,
effectsizes = coef,
standarderrors = se,
verbose = TRUE)
bc_bacon
> sessionInfo()
R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)
other attached packages:
bacon_1.38.0

This is qq plot and histogram generated before bacon correction