Four Interconnected Research Directions
Guided by the SPIN framework, our research spans from theoretical foundations to practical system development, with conditional mutual information (CMI) as a central unifying instrument.
Information-Theoretic Foundations of Deep Learning
We view deep neural networks as nonlinear information-processing systems and use CMI as a unifying instrument: it measures cluster concentration, governs susceptibility to knowledge distillation, and connects classification performance to the geometry of learned output distributions. Our frameworks (CMIC-DL, CMIM-DL, KD amplification) improve accuracy, robustness, and model IP protection.
CMI-Constrained DL · Knowledge Distillation · Model Security · Adversarial Robustness
Learn more about: Information-Theoretic Foundations of Deep Learning.
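To make the central quantity concrete, here is a minimal sketch (not the lab's CMIC-DL implementation) of a plug-in estimator of conditional mutual information I(X; Y | Z) for discrete variables, the kind of quantity used to measure how concentrated a network's outputs are given the true label:

```python
# Plug-in estimate of I(X; Y | Z) from paired samples of discrete variables.
# Illustrative only; variable names and the toy data are invented.
import math
from collections import Counter

def conditional_mutual_information(xs, ys, zs):
    """I(X; Y | Z) in nats, estimated from empirical joint counts."""
    n = len(xs)
    p_xyz = Counter(zip(xs, ys, zs))
    p_xz = Counter(zip(xs, zs))
    p_yz = Counter(zip(ys, zs))
    p_z = Counter(zs)
    cmi = 0.0
    for (x, y, z), c in p_xyz.items():
        # p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ]
        cmi += (c / n) * math.log(p_z[z] * c / (p_xz[(x, z)] * p_yz[(y, z)]))
    return cmi

# When X = Y regardless of Z, I(X; Y | Z) reduces to H(X | Z).
print(conditional_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]))
```

On this toy input the estimate equals H(X | Z) = log 2 nats, since X is uniform given either value of Z.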
AI-Inspired Extensions of Information Theory
Large language models are extremely high-order conditional sequence models that provide, for the first time, a concrete realization of the semantic source models classical information theory lacked. We develop distributional semantic information theory, where semantic meaning is represented through contextual probability distributions and learned embeddings. These developments suggest the possibility of a new chapter for information theory.
Semantic IT · Distributional Semantics · LLMs · Embedding
Learn more about: AI-Inspired Extensions of Information Theory.
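The distributional view above, that meaning lives in contextual probability distributions, can be illustrated with a toy sketch (the corpus and function names are invented for illustration; this is not the lab's formal theory, which LLMs realize at vastly greater scale):

```python
# Represent a word's "meaning" as its empirical distribution over nearby
# context words, then compare meanings by cosine similarity.
import math
from collections import Counter

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the dog .").split()

def context_distribution(word, tokens, window=2):
    """Empirical distribution over words within +/- window of `word`."""
    counts = Counter()
    for i, t in enumerate(tokens):
        if t == word:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(tokens[lo:i] + tokens[i + 1:hi])
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def cosine(p, q):
    dot = sum(p[w] * q.get(w, 0.0) for w in p)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)

cat, dog, mat = (context_distribution(w, corpus) for w in ("cat", "dog", "mat"))
# "cat" and "dog" appear in similar contexts, so their distributions are closer.
print(cosine(cat, dog), cosine(cat, mat))
```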
Beyond Text: Toward Extended Large Language Models
The predictive principles of LLMs are far more general than text. Drawing on the lab's decades of foundational work in image and video coding, we develop extended LLMs for multimodal data — images and videos represented as sequences of tokens using information-theoretically principled tokenization. This opens new questions about visual semantics and unified predictive models.
Multimodal LLMs · Visual Tokenization · Image & Video Coding · Visual Semantics
Learn more about: Beyond Text: Toward Extended Large Language Models.
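As a toy illustration of the token view of images (patch-grid tokenization is one common scheme; the lab's information-theoretically principled tokenizer is not shown here), an image can be flattened into a sequence of patch tokens that a sequence model consumes like text:

```python
# Split an image into a sequence of flattened patch "tokens".
import numpy as np

def patchify(image, patch=4):
    """Turn an (H, W, C) image into a (num_patches, patch*patch*C) sequence."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    # Reshape into a (H/p, p, W/p, p, C) grid, reorder so each patch is
    # contiguous, then flatten every patch into one token vector.
    grid = image.reshape(h // patch, patch, w // patch, patch, c)
    grid = grid.transpose(0, 2, 1, 3, 4)
    return grid.reshape(-1, patch * patch * c)

img = np.arange(8 * 8 * 3).reshape(8, 8, 3)
tokens = patchify(img, patch=4)
print(tokens.shape)  # 4 tokens, each 4*4*3 = 48 values
```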
Engineering AI for Nonlinear Systems
Many scientific and engineering systems are fundamentally nonlinear, yet theoretical tools for nonlinear systems remain far less developed than for linear ones. We develop neural architectures with rigorous theoretical foundations for nonlinear function approximation — moving from existential guarantees toward constructive principles that specify both architecture and nonlinear operator design. Applications include PDEs and high-dimensional function approximation.
Nonlinear Modeling · Activation Functions · PDE Solving · Engineering AI
Learn more about: Engineering AI for Nonlinear Systems.
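A minimal sketch of constructive nonlinear function approximation (pure NumPy random-feature regression, one classical constructive route; no claim to the lab's architectures or operator designs, and the target function and parameters are chosen only for illustration):

```python
# Fit f(x) = sin(2*pi*x) with a fixed random tanh layer plus a
# least-squares linear readout: the architecture is fully specified
# in advance, so the construction is explicit rather than existential.
import numpy as np

rng = np.random.default_rng(0)
n_features = 200

# Random hidden layer: fixed weights and biases, tanh nonlinearity.
W = rng.normal(scale=5.0, size=(1, n_features))
b = rng.uniform(-np.pi, np.pi, size=n_features)

def features(x):
    return np.tanh(x[:, None] * W + b)

# Only the linear output layer is trained, by ridge least squares.
x_train = np.linspace(0.0, 1.0, 100)
y_train = np.sin(2 * np.pi * x_train)
Phi = features(x_train)
coef = np.linalg.solve(Phi.T @ Phi + 1e-6 * np.eye(n_features), Phi.T @ y_train)

# Evaluate on points not seen during fitting.
x_test = np.linspace(0.0, 1.0, 37)
err = np.max(np.abs(features(x_test) @ coef - np.sin(2 * np.pi * x_test)))
print(f"max abs error: {err:.2e}")
```

The same fixed-nonlinearity-plus-trained-readout pattern underlies many constructive approximation results; richer architectures replace the random layer with principled nonlinear operators.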