Speaker
Dr. Robi Polikar, Professor, Electrical and Computer Engineering, Rowan University, Glassboro, NJ
Topic
SEMI-SUPERVISED AND ACTIVE LEARNING IN INITIALLY LABELED NONSTATIONARY AND EVOLVING ENVIRONMENTS
Abstract
An increasing number of real-world applications involve streaming data drawn from drifting, nonstationary distributions that change over time. These applications demand new algorithms that can learn and adapt to such changes, also known as concept drift. Properly characterizing such data with existing approaches typically requires a substantial number of labeled instances, which may be difficult, expensive, or even impractical to obtain. This scenario is related to the problem known as verification latency, where the labels of the training data become available only much later than the data itself or, in the extreme verification latency setting that we discuss in this talk, may never become available. In the first half of this lecture, we will introduce COMPOSE, a density tracking framework for learning from nonstationary streaming data in which labels are unavailable (or presented very sporadically) after initialization. We will discuss the algorithm in detail, along with its performance on real-world datasets and on several carefully constructed synthetic datasets, which demonstrate the algorithm's ability to learn under several different scenarios of initially labeled streaming environments (ILSE). Furthermore, we demonstrate that COMPOSE is competitive even with well-established, fully supervised nonstationary learning algorithms that receive labeled data in every batch. COMPOSE, like all algorithms, makes certain assumptions about the data distribution, the most important of which is the “limited-drift” assumption: any class distribution changes very little between two consecutive time steps, i.e., the drift is gradual. Besides abrupt drift, COMPOSE also cannot address special cases such as the introduction of a new class or significant overlap among existing classes, as such scenarios cannot be learned without additional labeled data.
However, scenarios that provide occasional or periodic limited labeled data are not uncommon, and for these many of COMPOSE’s restrictions can be lifted. In the second part of this talk, I will briefly introduce an alternate version of COMPOSE as a proof-of-concept algorithm that can identify the instances whose labels, if available, would be most beneficial, and then combine those instances with unlabeled data to actively learn from streaming nonstationary data, even when the distribution of the data experiences abrupt changes.
Speaker's biography
Robi Polikar received his B.Sc. degree in electronics and telecommunications engineering from Istanbul Technical University, Istanbul, Turkey, in 1993, and his M.S. and Ph.D. degrees, both co-majors in biomedical engineering and electrical engineering, from Iowa State University, in 1995 and 2000, respectively. In 2001 he joined Electrical and Computer Engineering at the then newly established College of Engineering of Rowan University, in Glassboro, NJ, where he established the Signal Processing and Pattern Recognition Laboratory (SPPRL). In 2003, he received the National Science Foundation’s CAREER award for developing incremental learning algorithms from streaming data. Since then, he has received subsequent grants from NSF, first to develop algorithms that can also learn in nonstationary environments, even for severely imbalanced datasets, and then for semi-supervised and unsupervised learning in initially labeled environments. His current research interests include adaptive intelligent systems and their various novel applications, such as incremental learning, nonstationary learning, data fusion, imbalanced data, and the missing feature problem in automated decision making. He is also working on applying novel machine learning algorithms to biomedical applications, such as early diagnosis of Alzheimer’s disease, brain-computer interfaces, and bioinformatics. He teaches upper-level undergraduate and graduate courses in wavelet theory, pattern recognition, neural networks, signal processing, and biomedical systems at Rowan. In 2012 he was awarded the Professional Progress in Engineering Award by Iowa State University, recognizing an outstanding alumnus in mid-career. He is a senior member of IEEE and an Associate Editor for IEEE Transactions on Neural Networks and Learning Systems, for which he recently guest edited a special issue on learning in nonstationary and evolving environments.
He is also a program evaluator for the Accreditation Board for Engineering and Technology (ABET).