Name: Xikai Tang
Date: Dec 23, 2022
Time: 2:00pm
Location: online
Supervisors: Dayan Ban, Zhou, Wang
Title: Spatial and Channel Attention-based 3D Object Classification Research for 3D Point Clouds
Abstract: Deep learning has been widely used in two-dimensional (2D) computer vision, and its success has made machine learning one of the key directions for future scientific research. In 2D computer vision, CNNs, RNNs, SENet, Transformers, and many other architectures have achieved impressive results on 2D data. As computer vision technologies develop at an accelerating pace, 2D data alone is no longer sufficient for machine learning research, and researchers are considering how to transfer 2D computer vision algorithms to the three-dimensional (3D) domain. Point clouds are an important representation of 3D data. Because 3D point cloud data carries more detailed information than 2D data, research on it has accelerated in recent years and has led to significant breakthroughs in artificial intelligence, deep learning, autonomous driving, tracking, and other domains. A large number of deep learning methods based on point clouds have recently been proposed; PointNet, P4Transformer, and SampleNet have shown significant success in the 3D domain. However, the unordered and sparse nature of point clouds makes it challenging to design deep neural networks for point cloud processing.
In chapter one, we introduce the background of point clouds, the existing public datasets, and the evaluation metrics, and then investigate and analyze deep learning methods for point cloud classification. In chapter two, we introduce the generation of point clouds and analyze existing methods for classification and segmentation. Furthermore, we investigate attention mechanisms in computer vision, including their background and evolution, spatial and channel attention in vision, and point cloud-based attention models in deep learning. Based on the analysis and investigation in chapters one and two, we found that, although this data type provides depth information, the sparsity and disorder of points pose a challenge in designing appropriate deep neural networks to process it, and exploring local relationships in point cloud data remains difficult. Therefore, in chapter three, to better extract features and obtain geometric information, we propose a point attention (PointAT) model and an attention value (AT Value) model for feature fusion, which applies the geometric relationship to the data. We then propose a new spatial and channel attention-based network (SCA). SCA is the overall structure of the network; its main purpose is to connect the PointAT and AT Value models and to capture meaningful geometric information by applying the geometric relationships between point cloud patches to the model. We also propose an auto pooling framework to extract global features. In this work, we concentrate on learning the geometric relationships within point cloud data. For this purpose, we introduce a point attention model based on spatial and channel attention to learn the geometric relationships between point clouds, and further combine these relationships with the point cloud data through the AT Value model. Finally, we introduce an adaptive downsampling structure, Autopooling. 
This downsampling structure considers each point’s importance weight and picks key points adaptively, and it can be used with convolutional networks. Extensive experiments conducted on two benchmark datasets (ModelNet40 and ShapeNet) clearly demonstrate the effectiveness of our SCA and SCA-Auto (SCA with Autopooling) methods. Finally, in chapter four, we summarize our contributions, the significance of the study's findings, and the limitations of the proposed methods. We then outline future research directions based on our analysis and investigation.