The ConsensusClusterPlus package accepts a pre-computed distance matrix as input in order to reduce computation time. According to the reference manual, this helps because ConsensusClusterPlus otherwise recalculates the distance matrix on every iteration.
I therefore pre-computed the distance matrix for a very large dataset (~700 samples with ~50,000 rows). However, when I pass this distance object to ConsensusClusterPlus, the computation time INCREASES dramatically and the run struggles to get past the first iteration. Of note, the "dist" object for this dataset is very large (approx. 4-6 GB). Still, given that the distances are pre-calculated, shouldn't this save time during consensus clustering?
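For reference, here is a rough size check on what a "dist" object should occupy. It is only a sketch, assuming the distances are stored as 8-byte doubles in a standard R "dist" object (n * (n - 1) / 2 values for n items). The helper `dist_bytes` and the toy matrix `m` are hypothetical and not part of the actual analysis. Since `dist()` works on the rows of its input, the orientation of the matrix decides whether the distances are between samples or between rows, and the resulting sizes differ by orders of magnitude.

```r
# Bytes needed for the numeric payload of a "dist" object over n items:
# n * (n - 1) / 2 pairwise distances, 8 bytes per double.
dist_bytes <- function(n) n * (n - 1) / 2 * 8

dist_bytes(700)    # distances between ~700 samples: about 2 MB
dist_bytes(50000)  # distances between ~50,000 rows: about 10 GB

# Toy check on a small random matrix (hypothetical stand-in data):
# 200 rows (features) by 10 columns (samples).
set.seed(1)
m <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)

length(dist(t(m)))  # 45    = 10 * 9 / 2, distances between samples
length(dist(m))     # 19900 = 200 * 199 / 2, distances between rows
```

A multi-gigabyte object is much closer to the second case than to the first, so it may be worth checking which dimension the distances were actually computed over.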
Any ideas would be great. Thanks.