PhD Defence • Computer Vision | Machine Learning • Unsupervised Losses for Clustering and Segmentation of Images: Theories & Optimization Algorithms

Thursday, June 27, 2024 — 11:00 AM to 2:00 PM EDT

Please note: This PhD defence will take place online.

Zhongwen (Rex) Zhang, PhD candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Yuri Boykov

Unsupervised losses are common for tasks with limited human annotations. In clustering, they are used to group data without any labels. In semi-supervised or weakly-supervised learning, they are applied to the unannotated part of the training data. In self-supervised settings, they are used for representation learning. They appear in diverse forms enforcing different prior knowledge. However, formulating and optimizing such losses poses challenges. Firstly, translating prior knowledge into mathematical formulations can be non-trivial. Secondly, the properties of standard losses may not be obvious across different tasks. Thirdly, standard optimization algorithms may not work effectively or efficiently, thus requiring the development of customized algorithms.

This thesis addresses several related classification and segmentation problems in computer vision, using unsupervised image- or pixel-level losses under a shortage of labels. First, we focus on the entropy-based decisiveness as a standard unsupervised loss for softmax models. While discussing it in the context of clustering, we prove that it leads to margin maximization, a property typically associated with supervised learning. In the context of weakly-supervised semantic segmentation, we combine decisiveness with the standard pairwise regularizer, the Potts model, and study the conceptual and empirical properties of different relaxations of the latter. For both clustering and segmentation problems, we provide new self-labeling optimization algorithms for the corresponding unsupervised losses. Unlike related prior work, we use soft hidden labels that can represent the estimated class uncertainty. Training network models with such soft pseudo-labels motivates a new form of cross-entropy maximizing the probability of “collision” between the predicted and estimated classes. The proposed losses and algorithms achieve state-of-the-art results on standard benchmarks. The thesis also introduces new geometrically motivated unsupervised losses for estimating thin structures, e.g., complex vasculature trees at near-capillary resolution in 3D medical data.
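To illustrate the two losses mentioned above, here is a minimal NumPy sketch: the entropy-based decisiveness loss (mean entropy of per-sample softmax predictions) and a collision cross-entropy between predicted distributions and soft pseudo-labels. The function names and exact formulations are illustrative assumptions, not the thesis's definitive implementations.

```python
import numpy as np

def decisiveness_loss(p, eps=1e-12):
    """Mean entropy of per-sample predictions p (N x K softmax outputs).
    Minimizing this pushes each prediction toward a confident (one-hot)
    output -- the 'decisiveness' prior. Illustrative formulation."""
    return float(np.mean(-np.sum(p * np.log(p + eps), axis=1)))

def collision_cross_entropy(p, q, eps=1e-12):
    """Negative log-probability that the predicted class (distribution p)
    and the estimated pseudo-label class (soft distribution q) coincide,
    i.e. -log sum_k p_k * q_k per sample. Illustrative formulation of a
    'collision'-style cross-entropy for soft pseudo-labels."""
    return float(np.mean(-np.log(np.sum(p * q, axis=1) + eps)))

# A confident one-hot prediction has (near) zero entropy, while a uniform
# prediction over K classes has entropy log(K); collision loss vanishes
# when prediction and pseudo-label agree on a single class.
```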


To attend this PhD defence on Zoom, please go to https://uwaterloo.zoom.us/j/93982652415.

Location
Online PhD defence
200 University Avenue West
Waterloo, ON N2L 3G1
Canada