Tutte Colloquium - Kate Larson

Title: Soft Condorcet Optimization

Speaker: Kate Larson
Affiliation: University of Waterloo
Location: MC 5501

Abstract: A common way to drive the progress of AI models and agents is to compare their performance on standardized benchmarks. This often involves aggregating individual performances across a potentially wide variety of tasks and benchmarks, and many of the leaderboards that draw the greatest attention are Elo-based.

In this paper, we describe a novel ranking scheme inspired by social choice frameworks, called Soft Condorcet Optimization (SCO), to compute the optimal ranking of agents: the one that makes the fewest mistakes in predicting the agent comparisons in the evaluation data. This optimal ranking is the maximum likelihood estimate when the evaluation data (which we view as votes) are interpreted as noisy samples from a ground-truth ranking, and it is a solution to Condorcet's original voting system criteria. SCO also inherits desirable social-choice-inspired properties: SCO ratings are maximal for Condorcet winners when they exist, which we show is not necessarily true for the classical rating system Elo.
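
To make the objective concrete, here is a minimal Python sketch of the discrete problem the abstract describes: treating each evaluation result as a vote (a ranking of agents), finding a Condorcet winner if one exists, and finding the ranking that makes the fewest mistakes on the pairwise comparisons. It is not the SCO method itself, which the abstract does not spell out. The brute-force search over permutations, the function names, and the three example votes are all assumptions made for illustration, and the search is only practical for a handful of agents.

```python
from itertools import permutations


def pairwise_wins(votes):
    """Return wins[(a, b)] = number of votes that rank a above b."""
    wins = {}
    for vote in votes:
        for i, a in enumerate(vote):
            for b in vote[i + 1:]:
                wins[(a, b)] = wins.get((a, b), 0) + 1
    return wins


def condorcet_winner(votes):
    """Return the agent that beats every other agent head-to-head, or None."""
    agents = sorted({a for vote in votes for a in vote})
    wins = pairwise_wins(votes)
    for a in agents:
        if all(wins.get((a, b), 0) > wins.get((b, a), 0)
               for b in agents if b != a):
            return a
    return None


def mistakes(ranking, wins):
    """Count vote comparisons that disagree with the given ranking."""
    total = 0
    for i, a in enumerate(ranking):
        for b in ranking[i + 1:]:
            # The ranking predicts a above b; every vote with b above a is a mistake.
            total += wins.get((b, a), 0)
    return total


def fewest_mistakes_ranking(votes):
    """Brute-force search (illustrative only) for a ranking with minimal mistakes."""
    agents = sorted({a for vote in votes for a in vote})
    wins = pairwise_wins(votes)
    return min(permutations(agents), key=lambda r: mistakes(r, wins))


# Hypothetical evaluation data: each tuple ranks agents from best to worst.
votes = [("A", "B", "C"), ("A", "C", "B"), ("B", "A", "C")]
winner = condorcet_winner(votes)
best = fewest_mistakes_ranking(votes)
print(winner, best, mistakes(best, pairwise_wins(votes)))
```

In this toy data, agent A beats both B and C head-to-head, so it is the Condorcet winner, and it also sits at the top of the ranking with the fewest mistakes.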
