PhD Comprehensive Seminar | Peiyi Zheng, Toward Unified Scientific Discovery via a Multimodal Transformer for Symbolic Regression

Tuesday, November 18, 2025 12:30 pm - 1:30 pm EST (GMT -05:00)

Location

MC 6460

Candidate

Peiyi Zheng | Applied Mathematics, University of Waterloo

Title

Toward Unified Scientific Discovery via a Multimodal Transformer for Symbolic Regression

Abstract

Symbolic regression (SR) aims to discover parsimonious, interpretable expressions that explain observational data. The task is central to scientific discovery because it moves beyond black-box prediction to uncover the mathematical laws governing a system. The core challenge is navigating the vast, discrete, combinatorial search space of possible formulas. Unlike standard regression, which optimizes parameters within a fixed model structure, SR must discover both the symbolic structure of an expression and its numerical parameters.
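To make the contrast with standard regression concrete, here is a minimal sketch of SR as a joint search over structure and parameters. It is an illustration only, not the method presented in the seminar: the candidate set `CANDIDATES`, the one-parameter form y = a * g(x), and the helper names are all assumptions chosen to keep the example small.

```python
import math

# Hypothetical candidate structures. Each entry maps x to g(x),
# and the model under test is y = a * g(x) with one free parameter a.
CANDIDATES = {
    "a*x": lambda x: x,
    "a*x**2": lambda x: x * x,
    "a*sin(x)": math.sin,
    "a*exp(x)": math.exp,
}


def fit_scale(g, xs, ys):
    """Closed-form least squares for a in y = a * g(x); returns (a, mse)."""
    gs = [g(x) for x in xs]
    a = sum(y * v for y, v in zip(ys, gs)) / sum(v * v for v in gs)
    mse = sum((y - a * v) ** 2 for y, v in zip(ys, gs)) / len(xs)
    return a, mse


def symbolic_regression(xs, ys):
    """Search over structures (outer loop) and fit parameters (inner step)."""
    best = None
    for name, g in CANDIDATES.items():
        a, mse = fit_scale(g, xs, ys)
        if best is None or mse < best[2]:
            best = (name, a, mse)
    return best


# Synthetic observations generated from y = 2.5 * sin(x).
xs = [0.1 * i for i in range(1, 31)]
ys = [2.5 * math.sin(x) for x in xs]
name, a, mse = symbolic_regression(xs, ys)
print(name, round(a, 6))
```

Standard regression corresponds to running `fit_scale` alone on one fixed structure. The outer loop over structures is what grows combinatorially once expressions can be composed freely, which is why an exhaustive search like this one does not scale.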

Prevailing approaches typically decouple the learning of symbolic structure from numerical evidence. This separation, seen in evolutionary search guided by post hoc scoring and in neural decoders trained in isolation from encoders, limits cross-modal transfer, weakens model identifiability, and degrades generalization to out-of-distribution data. To address this, we present a unified multimodal Transformer that processes expressions and measurements within a single, shared self-attention stack. The architecture uses modality-specific feed-forward network (FFN) experts and a fusion module to integrate the two modalities.
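The idea of a shared self-attention stack with modality-specific FFN experts can be sketched as a single block in plain Python. This is a rough illustration under stated assumptions, not the architecture from the talk: the single attention head, the hard routing by modality tag, the dimensions, and every name below are hypothetical, and the fusion module is omitted because the abstract does not describe it.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rng, rows, cols, scale=0.5):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def make_params(rng, d_model, d_hidden):
    """Shared attention weights plus one FFN expert per modality (assumed layout)."""
    params = {name: rand_matrix(rng, d_model, d_model) for name in ("wq", "wk", "wv")}
    params["experts"] = {
        m: (rand_matrix(rng, d_model, d_hidden), rand_matrix(rng, d_hidden, d_model))
        for m in ("expr", "data")
    }
    return params


def shared_self_attention(tokens, wq, wk, wv):
    """Single-head attention over ALL tokens, regardless of modality."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    scores = matmul(q, [list(col) for col in zip(*k)])  # q @ k^T
    scale = math.sqrt(len(q[0]))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def ffn(row, w1, w2):
    hidden = [max(0.0, h) for h in matmul([row], w1)[0]]  # ReLU
    return matmul([hidden], w2)[0]


def block(tokens, modality, params):
    """Shared attention, then each token goes to the FFN expert of its modality."""
    attended = shared_self_attention(tokens, params["wq"], params["wk"], params["wv"])
    mixed = [[t + a for t, a in zip(tr, ar)] for tr, ar in zip(tokens, attended)]
    out = []
    for row, m in zip(mixed, modality):
        w1, w2 = params["experts"][m]
        out.append([r + f for r, f in zip(row, ffn(row, w1, w2))])
    return out


d_model, d_hidden = 4, 8
params = make_params(random.Random(0), d_model, d_hidden)
# Two expression tokens followed by three measurement tokens (random stand-ins).
tokens = rand_matrix(random.Random(42), 5, d_model, scale=1.0)
modality = ["expr", "expr", "data", "data", "data"]
out = block(tokens, modality, params)
print(len(out), len(out[0]))
```

In this sketch, expression and measurement tokens attend to one another through the same attention weights, so information can cross modalities, while the per-modality experts keep separate feed-forward parameters. A real model would add layer normalization, multiple heads, and a learned fusion step.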