Research

Four Interconnected Research Directions

Guided by the SPIN framework, our research spans from theoretical foundations to practical system development, with conditional mutual information (CMI) as a central unifying instrument.


Information-Theoretic Foundations of Deep Learning

We view deep neural networks as nonlinear information-processing systems and use CMI as a unifying instrument: it measures cluster concentration, governs susceptibility to knowledge distillation, and connects classification performance to the geometry of learned output distributions. Our frameworks (CMIC-DL, CMIM-DL, KD amplification) improve accuracy, robustness, and model IP protection.
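
As a concrete illustration of the quantity at the heart of this direction, the sketch below estimates the CMI I(X; Ŷ | Y) of a trained classifier from its softmax outputs via a standard plug-in construction (a minimal example of ours, not the CMIC-DL implementation): within each ground-truth class, the predicted distributions are compared against their class centroid, so smaller values indicate tighter cluster concentration.

```python
import numpy as np

def conditional_mutual_information(probs, labels, eps=1e-12):
    """Plug-in estimate of I(X; Yhat | Y) from classifier outputs.

    probs  : (n, k) array; row i is the softmax distribution P(Yhat | x_i)
    labels : (n,) array of ground-truth class indices
    Returns the estimate in nats; smaller values mean the predicted
    distributions concentrate more tightly around their class centroids.
    """
    probs, labels = np.asarray(probs, float), np.asarray(labels)
    n, cmi = len(labels), 0.0
    for y in np.unique(labels):
        cluster = probs[labels == y]          # predictions for class y
        centroid = cluster.mean(axis=0)       # estimate of P(Yhat | Y = y)
        # mean KL(P(Yhat | x) || centroid) over the class, weighted by
        # the empirical class prior P(Y = y)
        kl = np.sum(cluster * (np.log(cluster + eps) - np.log(centroid + eps)), axis=1)
        cmi += (len(cluster) / n) * kl.mean()
    return cmi
```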

CMI Constrained DL   Knowledge Distillation   Model Security   Adversarial Robustness

Learn more about: Information-Theoretic Foundations of Deep Learning.


AI-Inspired Extensions of Information Theory

Large language models are extremely high-order conditional sequence models that provide, for the first time, a concrete realization of the semantic source models that classical information theory lacked. We develop a distributional semantic information theory in which semantic meaning is represented through contextual probability distributions and learned embeddings. These developments suggest the possibility of a new chapter for information theory.
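
A toy numerical illustration of the distributional view (a simplified example of ours; the vocabulary and probabilities are hypothetical, not drawn from an LLM): if a word's meaning in a given context is represented as a probability distribution over continuations, then two usages can be compared with a symmetric information-theoretic divergence.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence in nats: a symmetric, bounded distance
    between two contextual probability distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * (np.log(a + eps) - np.log(b + eps))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical next-token distributions for "bank" in two contexts,
# over a toy five-word vocabulary.
vocab = ["money", "loan", "river", "water", "tree"]
bank_financial = [0.45, 0.40, 0.05, 0.05, 0.05]
bank_riverside = [0.05, 0.05, 0.45, 0.40, 0.05]

print(js_divergence(bank_financial, bank_financial))  # 0.0: same sense
print(js_divergence(bank_financial, bank_riverside))  # large: distinct senses
```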

Semantic IT   Distributional Semantics   LLMs   Embedding

Learn more about: AI-Inspired Extensions of Information Theory.


Beyond Text: Toward Extended Large Language Models

The predictive principles behind LLMs extend far beyond text. Drawing on the lab's decades of foundational work in image and video coding, we develop extended LLMs for multimodal data, representing images and videos as sequences of tokens via information-theoretically principled tokenization. This opens new questions about visual semantics and unified predictive models.
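
The sketch below shows the general idea behind codebook-based visual tokenization, in the spirit of vector quantization (a minimal illustration under our own assumptions; the random codebook stands in for a learned one): each image patch is mapped to the index of its nearest codeword, turning the image into a token sequence that a sequence model can consume.

```python
import numpy as np

def tokenize_image(image, codebook, patch=8):
    """Quantize non-overlapping patches of a grayscale image into
    discrete tokens by nearest-codeword assignment.

    image    : (H, W) array with H and W divisible by `patch`
    codebook : (K, patch * patch) array of codeword vectors
    Returns a 1-D array of token indices in raster-scan order.
    """
    H, W = image.shape
    tokens = []
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            v = image[i:i + patch, j:j + patch].reshape(-1)
            # token = index of the codeword with least squared distortion
            tokens.append(int(np.argmin(np.sum((codebook - v) ** 2, axis=1))))
    return np.array(tokens)

# Illustrative usage: a random 32x32 image and a random 256-entry codebook
# yield a sequence of 16 tokens (one per 8x8 patch).
rng = np.random.default_rng(0)
tokens = tokenize_image(rng.random((32, 32)), rng.random((256, 64)))
print(tokens)
```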

Multimodal LLMs   Visual Tokenization   Image & Video Coding   Visual Semantics

Learn more about: Beyond Text: Toward Extended Large Language Models.


Engineering AI for Nonlinear Systems

Many scientific and engineering systems are fundamentally nonlinear, yet theoretical tools for nonlinear systems remain far less developed than for linear ones. We develop neural architectures with rigorous theoretical foundations for nonlinear function approximation, moving from existential guarantees toward constructive principles that specify both the architecture and the nonlinear operator design. Applications include PDE solving and high-dimensional function approximation.
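
To make the PDE application concrete, here is a minimal physics-informed training loop (a standard textbook construction, not the lab's constructive method; the architecture and hyperparameters are illustrative): a small tanh network is trained so that its second derivative satisfies a 1-D Poisson equation with a known exact solution.

```python
import math
import torch

# Fit u_theta(x) to -u'' = pi^2 sin(pi x) on (0, 1) with u(0) = u(1) = 0;
# the exact solution is u(x) = sin(pi x).
torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(128, 1, requires_grad=True)   # interior collocation points
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    residual = -d2u - math.pi ** 2 * torch.sin(math.pi * x)  # PDE residual
    boundary = net(torch.tensor([[0.0], [1.0]]))             # u(0), u(1)
    loss = residual.pow(2).mean() + boundary.pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(net(torch.tensor([[0.5]])).item())  # should approach sin(pi/2) = 1.0
```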

Nonlinear Modeling   Activation Functions   PDE Solving   Engineering AI

Learn more about: Engineering AI for Nonlinear Systems.