Name: Fangzheng Huang
Date: Dec 15, 2022
Time: 10:00am
Location: online
Supervisor: Dayan Ban
Title: Cyclic Style Generative Adversarial Network for Near Infrared and Visible Light Face Recognition
Abstract: Face recognition in the visible light (VIS) spectrum has been widely utilized in many practical applications. With the development of deep learning methods, recognition accuracy and speed have reached an excellent level, allowing face recognition to be applied in a wide range of circumstances. However, in some extreme situations, face recognition still cannot guarantee performance. One of the most significant cases is poor illumination: without adequate light sources, images cannot reveal the true identities of the people detected. To address this problem, the near infrared (NIR) spectrum offers an alternative in which face images can be captured clearly. Considerable research has been conducted in recent years, and current near infrared and visible light (NIR-VIS) face recognition methods have achieved strong performance.
In this thesis, I review current NIR-VIS face recognition methods and public NIR-VIS face datasets. I first list the public NIR-VIS face datasets used in most research. For each dataset, I present its characteristics, including the number of subjects, the collection environment, the image resolution, and whether the images are paired. I also summarize the evaluation protocols for each dataset, which helps with further analysis of performance. I then classify current NIR-VIS face recognition methods into three categories: image synthesis-based methods, subspace learning-based methods, and invariant feature-based methods. The contribution of each method is concisely explained. Additionally, I compare current NIR-VIS face recognition methods and offer my own assessment of their advantages and disadvantages.
To address the shortcomings of current methods, this thesis proposes a new model, the Cyclic Style Generative Adversarial Network (CS-GAN), which combines an image synthesis-based method with a subspace learning-based method. The proposed CS-GAN improves both the visual quality of images synthesized between the NIR and VIS domains and the recognition accuracy. CS-GAN is based on the Style-GAN 3 network, which was proposed in 2021. The proposed model contains two generators taken from pre-trained Style-GAN 3, which generate images in the NIR domain and the VIS domain, respectively. Each generator consists of a mapping network and a synthesis network: the mapping network disentangles the latent code to reduce correlation between features, and the synthesis network synthesizes face images through progressive growing training. The generators differ in their final layers, a to-RGB layer for the VIS domain and a to-grayscale layer for the NIR domain. The generators are embedded in a cyclic structure in which latent codes are sent into the synthesis network of the other generator to recreate images, and the recreated images are compared with real images in the same domain to ensure domain consistency (a sketch of this structure follows the abstract).

In addition, I apply the proposed cyclic subspace learning, which is composed of two parts. The first part introduces the proposed latent loss, which provides better control over the learning of the latent subspace. The latent codes influence both the details and the locations of features because they are fed continuously into the synthesis network, so control over the latent subspace strengthens feature consistency between synthesized images. The second part improves the style-transferring process by controlling high-level features with a perceptual loss in each domain. In the perceptual loss, a pre-trained VGG-16 network extracts high-level features, which can be regarded as the style of the images. The style loss can therefore control the style of images in both domains and ensure style consistency between synthesized and real images (see the second sketch below).

The visualization results show that the proposed CS-GAN model synthesizes better VIS images that are detailed, correctly colorized, and have clear edges. More importantly, the experimental results show that the Rank-1 accuracy on the CASIA NIR-VIS 2.0 database reaches 99.60%, improving on state-of-the-art methods by 0.2%.
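To make the cyclic structure concrete, the following is a minimal PyTorch-style sketch, not the thesis implementation: MappingNet, SynthesisNet, G_vis, and G_nir are hypothetical stand-ins for the pre-trained Style-GAN 3 components, and the tiny fully-connected synthesis network only mimics the interface of the real progressively grown one.

import torch
import torch.nn as nn

class MappingNet(nn.Module):
    # Stand-in for the Style-GAN 3 mapping network: z -> disentangled latent w.
    def __init__(self, z_dim=512, w_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, w_dim), nn.LeakyReLU(0.2),
            nn.Linear(w_dim, w_dim))

    def forward(self, z):
        return self.net(z)

class SynthesisNet(nn.Module):
    # Stand-in for the synthesis network: w -> feature map. The real network
    # grows images progressively; this toy version only matches the interface.
    def __init__(self, w_dim=512, feat_dim=64, size=64):
        super().__init__()
        self.feat_dim, self.size = feat_dim, size
        self.fc = nn.Linear(w_dim, feat_dim * size * size)

    def forward(self, w):
        feat = self.fc(w).view(-1, self.feat_dim, self.size, self.size)
        return torch.relu(feat)

class Generator(nn.Module):
    # Mapping + synthesis + a domain-specific final layer: a to-RGB layer
    # (3 channels) for the VIS domain, a to-grayscale layer (1 channel) for NIR.
    def __init__(self, out_channels):
        super().__init__()
        self.mapping = MappingNet()
        self.synthesis = SynthesisNet()
        self.to_out = nn.Conv2d(64, out_channels, kernel_size=1)

    def forward(self, z):
        w = self.mapping(z)                    # disentangled latent code
        img = self.to_out(self.synthesis(w))   # synthesized face image
        return img, w

G_vis = Generator(out_channels=3)   # VIS generator with to-RGB layer
G_nir = Generator(out_channels=1)   # NIR generator with to-grayscale layer

# Cyclic pass: the latent code from one generator drives the synthesis network
# of the other; the recreated image would then be compared against a real image
# in the same domain (e.g., with an L1 loss) to enforce domain consistency.
z = torch.randn(4, 512)
vis_img, w_vis = G_vis(z)
nir_recreated = G_nir.to_out(G_nir.synthesis(w_vis))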
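The perceptual (style) loss can be sketched in the same spirit. The choice of VGG-16 cutoff (here the first 16 layers, reaching relu3_3) and the Gram-matrix formulation of style are common conventions assumed for illustration; the thesis may use different layers or weighting.

import torch
import torch.nn.functional as F
from torchvision.models import vgg16

# Frozen, pre-trained VGG-16 used purely as a high-level feature extractor.
vgg_features = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def gram(feat):
    # Gram matrix of a feature map, a standard proxy for image "style".
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(synthesized, real):
    # Both inputs are (B, 3, H, W); a grayscale NIR image would be repeated
    # across 3 channels before being fed to VGG-16.
    return F.mse_loss(gram(vgg_features(synthesized)), gram(vgg_features(real)))

# Example usage with random tensors standing in for synthesized/real faces.
loss = style_loss(torch.rand(2, 3, 224, 224), torch.rand(2, 3, 224, 224))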