Grad Seminar: Transformer-based Point Cloud Processing and Analysis for LiDAR Remote Sensing
Abstract
The processing and analysis of Light Detection and Ranging (LiDAR) point cloud data, a fundamental task in three-dimensional (3D) computer vision, is essential for a wide range of remote sensing applications. However, the unordered structure, sparsity, and uneven spatial distribution of LiDAR point clouds pose significant challenges to effective and efficient processing. In recent years, Transformers have demonstrated notable advantages over traditional deep learning methods in computer vision, yet the design of Transformer-based frameworks tailored to point clouds remains underexplored. This thesis investigates the potential of Transformer models for accurate and efficient LiDAR point cloud processing.
Firstly, a 3D Global-Local (GLocal) Transformer Network (3DGTN) is introduced to capture both local and global context, thereby enhancing model accuracy for LiDAR data. This design not only ensures a comprehensive understanding of point cloud characteristics but also establishes a foundation for subsequent efficient Transformer frameworks.
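As a rough illustration of what combining local and global context can mean for a point cloud, the sketch below runs scaled dot-product attention twice per point: once over its k nearest neighbours and once over all points, then concatenates the two results. It is a minimal sketch under stated assumptions, not the 3DGTN architecture: the function names (`attend`, `knn_indices`, `glocal_features`), the use of raw coordinates as features, and the concatenation step are all hypothetical choices made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def knn_indices(points, i, k):
    """Indices of the k points closest to points[i] (the point itself included)."""
    order = sorted(range(len(points)), key=lambda j: math.dist(points[i], points[j]))
    return order[:k]

def glocal_features(points, feats, k=3):
    """Hypothetical global-local feature: [local attention | global attention]."""
    out = []
    for i in range(len(points)):
        nbrs = [feats[j] for j in knn_indices(points, i, k)]
        local = attend(feats[i], nbrs, nbrs)      # context from nearby points only
        global_ = attend(feats[i], feats, feats)  # context from the whole cloud
        out.append(local + global_)               # list concatenation
    return out

# Toy cloud: three nearby points and one far outlier; coordinates double as features.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
features = glocal_features(points, points, k=2)
```

In this toy setup the local branch for the first point ignores the distant outlier, while the global branch weighs it along with every other point, which is the distinction the paragraph above draws.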
Secondly, a fast point Transformer network with Dynamic Token Aggregation (DTA-Former) is proposed to improve model speed. By optimizing point sampling, grouping, and reconstruction, DTA-Former substantially reduces the time complexity of 3DGTN while retaining its strong accuracy.
Finally, to further reduce time and space complexity, a 3D Learnable Supertoken Transformer (3DLST) is presented. Building on DTA-Former, 3DLST employs a novel supertoken clustering strategy that lowers computational overhead and memory consumption, achieving state-of-the-art performance across multi-source LiDAR point cloud tasks in terms of both accuracy and efficiency.
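To make the efficiency argument concrete, the following back-of-the-envelope sketch counts attention score entries for full self-attention over N tokens versus a generic cluster-attend-broadcast scheme with M supertokens. The three-stage cost model (assign tokens to supertokens, attend among supertokens, broadcast back) is an assumption chosen for illustration and is not taken from 3DLST; the function names and the example sizes are likewise hypothetical.

```python
def full_attention_pairs(n_tokens):
    """Score entries for self-attention over all tokens: N x N."""
    return n_tokens * n_tokens

def supertoken_attention_pairs(n_tokens, n_super):
    """Score entries under an assumed three-stage scheme:
    token-to-supertoken assignment (N x M), attention among
    supertokens (M x M), and broadcast back to tokens (M x N)."""
    return 2 * n_tokens * n_super + n_super * n_super

# Illustrative sizes only: 4096 points grouped into 64 supertokens.
n, m = 4096, 64
full = full_attention_pairs(n)                  # 16777216
reduced = supertoken_attention_pairs(n, m)      # 528384
```

Under these assumed sizes the clustered scheme needs roughly one thirtieth of the score entries, and the gap widens as N grows because the dominant term is linear in N rather than quadratic.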
These Transformer-based frameworks contribute to more robust and scalable LiDAR point cloud processing solutions, supporting diverse remote sensing applications such as urban planning, environmental monitoring, and autonomous navigation. By enabling efficient yet high-accuracy analysis of large-scale 3D data, this work fosters further research and innovation in LiDAR remote sensing.
Presenter
Dening Lu, PhD candidate in Systems Design Engineering