Please note: This PhD seminar will take place in MC 1085.
Zhongwen (Rex) Zhang, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Yuri Boykov
The shortage of pixel-level labels motivates weakly-supervised methods for training segmentation networks. This presentation focuses on supervision from image-level class tags. For image classification networks, training with such labels constitutes full supervision. However, training a segmentation network to produce a pixel-accurate labeling when only image-level tags are available is a challenging problem. In particular, since each training image typically contains objects of several different classes, the network has to match the class tags to the different patterns representing those objects (e.g., grass is green and the sky is blue, not vice versa).
The network must resolve this combinatorial problem (related to Multiple Instance Learning) while also learning good deep features that discriminate between the classes. Common methodologies convert the pixel-level predictions of the trained segmentation network to image-level predictions (e.g., by generalized pooling) and/or develop techniques for extracting pixel-localization information from image-level classification (e.g., Class Activation Maps). The presentation discusses relevant unsupervised and weakly-supervised loss functions and reviews a number of relevant papers, including some from this year’s CVPR.
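To make the first methodology concrete, the following is a minimal sketch of converting pixel-level predictions to image-level predictions by pooling, so that image-level tags can supervise a segmentation output. It is an illustration only, not the method of any particular paper covered in the talk. It assumes log-sum-exp pooling as the generalized pooling operator, a sharpness parameter r, and a multi-label binary cross-entropy loss; the function names lse_pool and tag_loss are hypothetical.

```python
import numpy as np


def lse_pool(scores, r=5.0):
    """Pool per-pixel class scores (C, H, W) into per-image scores (C,).

    Assumed operator: log-sum-exp pooling, (1/r) * log(mean(exp(r * s))).
    Small r approaches average pooling; large r approaches max pooling.
    """
    flat = scores.reshape(scores.shape[0], -1)
    m = flat.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    return m[:, 0] + np.log(np.mean(np.exp(r * (flat - m)), axis=1)) / r


def tag_loss(scores, tags, r=5.0):
    """Multi-label binary cross-entropy between pooled scores and image tags.

    scores: (C, H, W) pixel-level logits from a segmentation network.
    tags:   (C,) array of 0/1 image-level class tags.
    """
    logits = lse_pool(scores, r)
    # Numerically stable form of BCE with logits.
    per_class = (np.maximum(logits, 0.0) - logits * tags
                 + np.log1p(np.exp(-np.abs(logits))))
    return float(per_class.mean())


# Toy example (made-up numbers): class 0 fires at one pixel, class 1 nowhere.
scores = np.full((2, 4, 4), -10.0)
scores[0] = 0.0
scores[0, 1, 1] = 10.0
loss_matching = tag_loss(scores, np.array([1.0, 0.0]))
loss_swapped = tag_loss(scores, np.array([0.0, 1.0]))
```

In this toy case the loss is lower when the tags agree with the pooled pixel scores than when they are swapped, which is the signal that lets image-level tags shape pixel-level predictions. The choice of r trades off between rewarding a single confident pixel (max-like) and requiring broad support across the image (average-like).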