Certifying clusters from sum-of-norms clustering


Sum-of-norms clustering is a clustering formulation based on convex optimization that automatically induces hierarchy. Multiple algorithms have been proposed to solve the optimization problem: subgradient descent by Hocking et al., ADMM and ADA by Chi and Lange, stochastic incremental algorithm by Panahi et al. and semismooth Newton-CG augmented Lagrangian method by Sun et al. All algorithms yield approximate solutions, even though an exact solution is demanded to determine the correct cluster assignment. The purpose of this paper is to close the gap between the output from existing algorithms and the exact solution to the optimization problem. We present a clustering test that identifies and certifies the correct cluster assignment from an approximate solution yielded by any primal-dual algorithm. Our certification validates clustering for both unit and multiplicative weights. The test may not succeed if the approximation is inaccurate. However, we show the correct cluster assignment is guaranteed to be certified by a primal-dual path following algorithm after sufficiently many iterations, provided that the model parameter λ avoids a finite number of bad values. Numerical experiments are conducted on Gaussian mixture and half-moon data, which indicate that carefully chosen multiplicative weights increase the recovery power of sum-of-norms clustering.


Publisher's Version

Last updated on 01/09/2023