Master’s Thesis Presentation • Artificial Intelligence • Weakly-supervised Semantic Segmentation with Regularized Loss Hyperparameter Search

Thursday, September 9, 2021 1:00 pm - 1:00 pm EDT (GMT -04:00)

Please note: This master’s thesis presentation will be given online.

Zongliang Ji, Master’s candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Olga Veksler

Weakly supervised segmentation significantly reduces user annotation effort. Recently, regularized loss was proposed for single object class segmentation under image-level weak supervision. Regularized loss consists of several components. Each component, if used in isolation, would lead to some trivial solution. However, a weighted combination of the loss components introduces a balance between the individual biases. The weight of each component in regularized loss is controlled by a hyperparameter. We propose an approach that searches for regularized loss hyperparameters. The main idea is to set the most important regularized loss component to a high weight while ensuring the other loss components are set to weights just sufficiently high to prevent the trivial solution favoured by the most important component. Our approach significantly improves performance over prior work with fixed hyperparameters and improves the state of the art in image-level supervised salient and semantic segmentation.
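
The search idea described above can be illustrated with a minimal sketch. This is not the procedure from the thesis: the function names, the candidate weights, the 5% triviality margin, and the toy stand-in for training are all assumptions made for illustration. The sketch fixes the main component at a high weight (implicitly) and looks for the smallest weight of one auxiliary component that keeps the result from collapsing to an all-background or all-foreground segmentation.

```python
from typing import Callable, Optional, Sequence


def is_trivial(foreground_fraction: float, margin: float = 0.05) -> bool:
    """Treat a result as trivial if almost everything is one label.

    The 5% margin is an illustrative assumption, not a value from the thesis.
    """
    return foreground_fraction < margin or foreground_fraction > 1.0 - margin


def smallest_sufficient_weight(
    candidates: Sequence[float],
    foreground_fraction_for: Callable[[float], float],
) -> Optional[float]:
    """Return the smallest candidate weight that avoids a trivial solution.

    `foreground_fraction_for(weight)` is a hypothetical callback that trains
    with the given auxiliary weight and reports the mean foreground fraction
    of the predicted masks.
    """
    for weight in sorted(candidates):
        if not is_trivial(foreground_fraction_for(weight)):
            return weight
    return None  # no candidate prevented the trivial solution


def toy_foreground_fraction(weight: float) -> float:
    """Toy stand-in for training: a tiny weight gives a near-empty mask."""
    return min(0.4, weight)


if __name__ == "__main__":
    chosen = smallest_sufficient_weight(
        [1.0, 0.3, 0.1, 0.03, 0.01], toy_foreground_fraction
    )
    print(chosen)
```

In a real setting each call to the callback would be a full (or shortened) training run, so the number of candidates would need to stay small.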

In addition to image-level weak supervision, we propose a new approach for semantic segmentation with weak supervision using bounding box annotations. This approach also makes use of regularized loss with hyperparameter search. Previous work on weak supervision from bounding boxes constructs pseudo-ground truth by segmenting each box into object and background independently of all the other boxes in the dataset. We argue that the collection of boxes for the same class naturally provides a dataset from which we can learn the appearance of that object class. Learning a good appearance model, in turn, leads to a better segmentation of each individual box. Thus, for each class, we propose to train a segmentation CNN from the dataset consisting of the bounding boxes for that class using our proposed single object approach.

After we train these single-class CNNs, we apply them back to the training bounding boxes to obtain object/background segmentations and merge them to construct pseudo-ground truth. The obtained pseudo-ground truth is used for training a standard segmentation CNN. We improve the state of the art on the Pascal VOC 2012 benchmark in the bounding box weak supervision setting.
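
The merging step can be pictured with a small sketch. The abstract does not say how the per-box segmentations are merged, so everything here is an assumption for illustration: the box format, the label 0 for background, and the rule that a later box overwrites an earlier one where they overlap.

```python
from typing import List, Tuple

# (top, left, bottom, right), with bottom and right exclusive.
Box = Tuple[int, int, int, int]
# Binary mask the same size as its box: 1 = object, 0 = background.
Mask = List[List[int]]

BACKGROUND = 0  # assumed label for pixels not claimed by any object


def merge_box_masks(
    height: int,
    width: int,
    detections: List[Tuple[int, Box, Mask]],
) -> List[List[int]]:
    """Paste per-box object masks into one label map for the whole image.

    Each detection is (class_id, box, mask). Pixels outside every box, or
    marked background inside a box, keep the BACKGROUND label. Where object
    pixels from two boxes overlap, the later detection wins; this tie-break
    is an assumption, not a rule taken from the thesis.
    """
    label_map = [[BACKGROUND] * width for _ in range(height)]
    for class_id, (top, left, bottom, right), mask in detections:
        for row in range(top, bottom):
            for col in range(left, right):
                if mask[row - top][col - left]:
                    label_map[row][col] = class_id
    return label_map


if __name__ == "__main__":
    pseudo_ground_truth = merge_box_masks(
        3,
        4,
        [
            (1, (0, 0, 2, 2), [[1, 0], [1, 1]]),
            (2, (1, 1, 3, 4), [[1, 1, 0], [0, 1, 1]]),
        ],
    )
    for row in pseudo_ground_truth:
        print(row)
```

The resulting label map would then serve as the training target for the standard segmentation CNN.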

To join this master’s thesis presentation on MS Teams, please go to