Grad Seminar: Towards Urban Digital Twins With Gaussian Splatting, Large-Language-Models, and Cloud Mapping Services
Abstract
Urban Digital Twins are powerful tools that leverage 3D modelling, multi-source data, and online geographic information systems to assist urban planning, infrastructure management, and decision-making. However, traditional 3D reconstruction methods face significant limitations in meeting the demands of next-generation Digital Twins: they scale poorly, lack photorealism, and are computationally inefficient. This thesis addresses these challenges by proposing a comprehensive framework that leverages recent advances in AI-driven 3D reconstruction and large language models (LLMs), paving the way for efficient, accurate, and user-accessible Urban Digital Twins of the future.
To address the challenge of generating accurate and efficient large-scale outdoor 3D point clouds from remote sensing images, this thesis introduces a novel pipeline that combines Google Earth imagery with Gaussian Splatting techniques. This approach enables dense and photorealistic 3D reconstruction at the city scale, significantly improving view synthesis quality and reconstruction accuracy compared to traditional photogrammetry, multi-view stereo, and neural radiance field methods. The proposed framework is demonstrated through the reconstruction of the City of Waterloo, representing one of the first large-scale applications of Gaussian Splatting in urban modelling.
For the challenge of efficiently generating precise 3D mesh representations of individual buildings, a new pipeline is developed that integrates open-set foundational object segmentation models with the latest Gaussian Splatting techniques. This enables the reconstruction of building meshes from multi-view 2D images using simple text or click-based prompts. Unlike dense point clouds, mesh representations are better suited for simulations and numerical modelling. The pipeline leverages Google Earth Studio, allowing users to generate detailed 3D building models from basic inputs such as an address, postal code, geographic coordinates, or place name.
To improve the accessibility and usability of Digital Twin systems, this thesis presents the Digital Twin Buildings framework, combining mesh-based 3D reconstruction with cloud mapping services and a novel multi-agent system powered by large language models. This framework enables the automated retrieval, visualization, and analysis of 3D building models alongside GIS-integrated data. It supports natural language interactions, automated code generation, and multi-modal analytics. A case study on air quality analysis during a major wildfire event illustrates the system's potential to support decision-making and interdisciplinary urban research.
By unifying these contributions, this thesis demonstrates a significant step toward accurate and efficient digital twin systems for urban modelling. It provides a blueprint for developing the next generation of Urban Digital Twins, made accessible to users through AI-assisted tools that support natural language interaction.
Presenter
Kyle Gao, PhD candidate in Systems Design Engineering