Following this paper, after filtering the methylated and unmetylated count matrix using the filter
keep <- rowSums(counts_total >= 10) == 100
my DGE object is about 6e5 rows x 100 columns. If I then further filter to promoters I can carry out the differential methylation analysis in predefined gene promoters outlined in the paper without issue, however I cannot estimate dispersion and carry out DMR analysis without doing so because estimateDisp takes too long to finish (it just doesn't, for me). For clarity , the command is
I don't work on RRBS data, but I can (sort of) address the edgeR parts of your post. modelMatrixMeth generates an additive design matrix, meaning that glmFit in edgeR must use the mglmLevenberg algorithm rather than the faster mglmOneWay algorithm. This is exacerbated by the number of samples in your dataset, which results in a proportional number of coefficients in the design matrix. The Levenberg algorithm performs a Cholesky decomposition, which has cubic time complexity with respect to the number of coefficients. So you can really see how everything slows right down - the number of rows doesn't help, either.
There are several options for speeding up the analysis. The first is to parallelize the GLM fits, assuming you have a cluster over which the calculations could be distributed. This doesn't really solve the cubic scaling but it would probably increase the range of datasets for which calculations are still feasible. The second approach is to improve the efficiency of the Cholesky decomposition, e.g., using Matrix:::dsCMatrix_Cholesky for sparse matrices. This is... possible, but not straightforward. Both of these solutions are developer-side, so there's not much that can be done on your end.
In the meantime, I suggest implementing a "poor man's parallelization", e.g., break up the analysis by chromosome and submit each of these as a separate job. There should still be enough regions on each chromosome for empirical Bayes to be effective. I have used a similar strategy for Hi-C data analyses with edgeR, when the genome-by-genome interaction space was too large to be loaded.