I am trying to run edgeR and limma for very large sample size groups (group1 sample size = 236, group2 sample size = 490). glmQLFit() in edgeR, and voom() + lmFit() in limma-voom, each take about 2 min when using two confounding variables. For even larger sample sizes, where each group has over 1000 samples, these functions take several minutes.
I am wondering whether these functions can be optimized to reduce compute time. Is there any way to parallelize the computation so as to decrease the run time?
Thank you.
If 2 minutes is unacceptable to you, then you can fall back to limma-trend, which should give a dramatic speedup and finish in seconds. Inference is often similar.
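For reference, a minimal limma-trend sketch might look like the following, assuming `counts` is a gene-by-sample count matrix and `design` is a model matrix containing the group effect and the two confounders (those object names are placeholders, not from the original post):

```r
library(edgeR)
library(limma)

# Normalize library sizes as usual.
dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)

# log-CPM values with a small prior count; limma-trend models these
# directly, skipping the per-observation precision weights that voom()
# computes -- this is where the speedup comes from.
logCPM <- cpm(dge, log = TRUE, prior.count = 3)

fit <- lmFit(logCPM, design)
fit <- eBayes(fit, trend = TRUE)  # trend = TRUE is what makes this "limma-trend"
topTable(fit, coef = 2)           # coefficient of interest; adjust for your design
```

The trend = TRUE option fits a mean-variance trend during empirical Bayes moderation, which serves a similar purpose to voom's observation-level weights when library sizes are reasonably consistent.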