I am getting a segfault from the glmQLFit function when analyzing a 16S metagenomics data set with edgeR. The error is:
Error: segfault from C stack overflow
Other data sets run fine for me in this installation of edgeR; only this particular data set triggers the error. It has a large number of samples (155) and a small number of genomic features (13 bacterial orders), and I'm not sure whether those dimensions are part of the issue. Any help would be welcome.
Thanks, Mark
These are the commands I am running:
library(edgeR)
dat = read.delim("microbial_data.txt",row.names=1)
meta = read.delim("microbial_metadata.txt",row.names=1)
f1 = factor(meta[,1])
f2 = factor(meta[,2])
f3 = factor(meta[,3])
f4 = factor(meta[,4])
design = model.matrix(~ f1 + f2 + f3 + f4)
dge = DGEList(dat)
dge = calcNormFactors(dge)
dge = estimateDisp(dge, design)
fit = glmQLFit(dge, design)
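For reference, here is a minimal sketch of the sanity checks that could be run on the inputs before the glmQLFit call. It uses base R only and simulated stand-in counts with the same dimensions as my data (13 features x 155 samples). The counts, factor levels, seed, and variable names such as rows_match and full_rank are assumptions for illustration, not my real data, and passing these checks would not by itself explain or rule out the segfault.

```r
# Stand-in data with the dimensions described above (13 features x 155 samples).
# The counts and factor levels are simulated, not the real data.
set.seed(1)
n_features <- 13
n_samples <- 155
dat <- matrix(rnbinom(n_features * n_samples, mu = 100, size = 2),
              nrow = n_features,
              dimnames = list(paste0("order", seq_len(n_features)),
                              paste0("s", seq_len(n_samples))))
meta <- data.frame(
  f1 = factor(sample(c("a", "b"), n_samples, replace = TRUE)),
  f2 = factor(sample(c("a", "b", "c"), n_samples, replace = TRUE)),
  f3 = factor(sample(c("a", "b"), n_samples, replace = TRUE)),
  f4 = factor(sample(c("a", "b"), n_samples, replace = TRUE)),
  row.names = colnames(dat)
)
design <- model.matrix(~ f1 + f2 + f3 + f4, data = meta)

# With the default na.action, model.matrix() drops samples that have an NA
# in any factor, so the design can end up with fewer rows than the count
# matrix has columns.
rows_match <- nrow(design) == ncol(dat)

# A rank-deficient design means some coefficients are not estimable.
design_rank <- qr(design)$rank
full_rank <- design_rank == ncol(design)

# Features or samples whose counts are all zero.
zero_features <- sum(rowSums(dat) == 0)
zero_samples <- sum(colSums(dat) == 0)

# Residual degrees of freedom left after fitting the design.
resid_df <- nrow(design) - design_rank

print(c(rows_match = rows_match, full_rank = full_rank))
print(c(zero_features = zero_features, zero_samples = zero_samples,
        resid_df = resid_df))
```

To run the same checks on the real inputs, dat and meta from read.delim would replace the simulated objects, with the factors taken from the metadata columns as in the commands above.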
Here is the sessionInfo():
R version 4.5.2 (2025-10-31)
Platform: x86_64-apple-darwin20
Running under: macOS Sequoia 15.6.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Chicago
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] edgeR_4.8.0 limma_3.66.0
loaded via a namespace (and not attached):
[1] compiler_4.5.2 grid_4.5.2 locfit_1.5-9.12 lattice_0.22-7
[5] statmod_1.5.1
The data sets, as well as the above R script and sessionInfo, are available here: Box link to data
