#23-001 -- Lin Hu, Anqi Li, Xu Tan
A Rational Inattention Theory of Echo Chamber
Abstract
JEL Classification
D83, D85
#23-002 -- Clemens Possnig
Reinforcement Learning and Collusion
Abstract
This paper presents an analytical characterization of the long-run policies learned by algorithms that interact repeatedly. These algorithms update policies, which are maps from observed states to actions. I show that the long-run policies correspond to equilibria that are stable points of a tractable differential equation. As a running example, I consider a repeated Cournot game of quantity competition, for which learning the stage-game Nash equilibrium serves as the non-collusive benchmark. I give necessary and sufficient conditions for this Nash equilibrium not to be learned. These are requirements on the state variables algorithms use to determine their actions, and on the stage game. When algorithms determine actions based only on the past period's price, the Nash equilibrium can be learned. However, agents may condition their actions on richer types of information beyond the past period's price. In that case, I give sufficient conditions under which the policies converge with positive probability to a collusive equilibrium, while never converging to the Nash equilibrium.
JEL Classification
C73, D43, D83
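To make the running example concrete, here is a minimal simulation sketch (not the paper's algorithm): two epsilon-greedy Q-learners in a repeated Cournot duopoly, each conditioning its quantity on the previous period's price. The linear inverse demand P = a - b(q1 + q2), zero marginal cost, the discretization, and all parameter values are illustrative assumptions.

```python
import numpy as np

# Two epsilon-greedy Q-learners in a repeated Cournot duopoly (illustrative sketch).
# State = last period's price (discretized); action = own quantity.
rng = np.random.default_rng(0)
a, b = 10.0, 1.0                         # assumed demand intercept and slope
quantities = np.linspace(0.5, 5.0, 10)   # discretized action grid
n_actions = len(quantities)

price_bins = np.linspace(0.0, a, 11)
n_states = len(price_bins) + 1           # np.digitize maps a price to 0..len(bins)

def price(q1, q2):
    return max(a - b * (q1 + q2), 0.0)

Q = [np.zeros((n_states, n_actions)) for _ in range(2)]
alpha, gamma, eps = 0.1, 0.95, 0.1       # learning rate, discount, exploration

s = 0
for t in range(200_000):
    acts = []
    for i in range(2):
        if rng.random() < eps:
            acts.append(int(rng.integers(n_actions)))   # explore
        else:
            acts.append(int(Q[i][s].argmax()))          # exploit
    q1, q2 = quantities[acts[0]], quantities[acts[1]]
    p = price(q1, q2)
    s_next = int(np.digitize(p, price_bins))
    rewards = (p * q1, p * q2)                          # profits, zero cost
    for i in range(2):
        td_target = rewards[i] + gamma * Q[i][s_next].max()
        Q[i][s, acts[i]] += alpha * (td_target - Q[i][s, acts[i]])
    s = s_next

learned = [float(quantities[int(Q[i][s].argmax())]) for i in range(2)]
print("learned quantities:", learned, "Cournot-Nash q* = a/(3b):", a / (3 * b))
```

With zero marginal cost and linear demand, the symmetric stage-game Nash quantity is q* = a/(3b), which the printout compares against as the non-collusive benchmark.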
#23-003 -- Clemens Possnig
Learning to Best Reply: On the Consistency of Multi-Agent Batch Reinforcement Learning
Abstract
This paper provides asymptotic results for a class of model-free actor-critic batch reinforcement learning algorithms in the multi-agent setting. At each period, each agent faces an estimation problem (the critic, e.g., a value function) and a policy updating problem. The estimation step is done by parametric function estimation based on a batch of past observations. Agents have no knowledge of each other's incentives and policies. I provide sufficient conditions for each agent's parametric function estimator to be consistent in the multi-agent environment, which enables agents to learn to best respond despite the non-stationarity inherent in multi-agent systems. The conditions depend on the environment, batch size, and policy step size. These sufficient conditions are useful in the asymptotic analysis of multi-agent learning, e.g., in the application of long-run characterizations using stochastic approximation techniques.
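The batch structure described above can be illustrated with a single-agent skeleton (a sketch under our own assumptions, not the paper's algorithm): each iteration fits a parametric critic by least squares on a batch of transitions, then takes a small policy step using the fitted critic. The feature class, Gaussian policy, and bandit-style reward are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def features(s, a):
    # assumed parametric class for the critic (quadratic features)
    return np.array([1.0, s, a, s * s, s * a, a * a])

def reward(s, a):
    # stand-in environment: reward peaks at a = 0.5 * s (unknown to the agent)
    return -(a - 0.5 * s) ** 2 + rng.normal(scale=0.1)

theta, sigma = 0.0, 0.5       # policy: a ~ N(theta * s, sigma^2)
batch_size, step = 500, 0.05  # batch size and step size govern consistency

for it in range(200):
    # (1) gather a batch of (state, action, reward) under the current policy
    S = rng.uniform(-1.0, 1.0, batch_size)
    A = theta * S + sigma * rng.normal(size=batch_size)
    R = np.array([reward(s, a) for s, a in zip(S, A)])

    # (2) critic: least-squares fit of Q(s, a) on the batch
    X = np.stack([features(s, a) for s, a in zip(S, A)])
    w, *_ = np.linalg.lstsq(X, R, rcond=None)
    Q_hat = X @ w

    # (3) actor: score-function policy-gradient step using the fitted critic
    score = (A - theta * S) * S / sigma**2  # d/dtheta log N(a; theta*s, sigma^2)
    theta += step * np.mean(Q_hat * score)

print("learned theta:", theta, "(optimum is 0.5 under the assumed reward)")
```

In the multi-agent version studied in the paper, the other agents' evolving policies make the environment non-stationary between batches, which is what the consistency conditions on batch size and step size address.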
#23-004 -- Federico Echenique and Anqi Li
Rationally Inattentive Statistical Discrimination: Arrow Meets Phelps
Abstract
When information acquisition is costly but flexible, a principal may rationally acquire information that favors “majorities” over “minorities” unless the latter are strictly more productive than the former. Majorities therefore face incentives to invest in becoming productive, whereas minorities are discouraged from such investments. The principal, in turn, focuses scarce attentional resources on majorities precisely because they are likely to invest. We give conditions under which the resulting discriminatory equilibrium is the one most preferred by the principal, even though all groups are ex-ante identical. Our results add to discussions of affirmative action, implicit bias, and occupational segregation and stereotypes.
JEL Classification
D82, D86, D31, J71
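A stylized way to see the investment feedback loop (notation ours, not the paper's model) is as a fixed-point condition, sketched below.

```latex
% Stylized fixed-point logic (notation ours, not the paper's model).
% A group-g worker invests iff her cost is below the expected return R(a_g),
% where a_g is the attention the principal pays to group g and R is increasing:
\[
  \pi_g = F\bigl(R(a_g)\bigr),
\]
% with F the cost distribution. Because the principal's optimal attention rises
% with the probability that the group invested, a_g = A(\pi_g) with A increasing,
\[
  \pi_g = F\bigl(R(A(\pi_g))\bigr),
\]
% which can admit multiple fixed points: ex-ante identical groups may settle at
% different ones, producing the discriminatory equilibrium described above.
```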
#23-005 -- Lin Hu, Matthew Kovach and Anqi Li
Learning News Bias: Misspecifications and Consequences
Abstract
We study how a decision maker (DM) learns about the bias of unfamiliar news sources. Absent any frictions, a rational DM uses known sources as a yardstick to discern the true bias of a source. If the DM has misspecified beliefs, this process fails. We derive long-run beliefs, behavior, welfare, and the corresponding comparative statics when the DM has dogmatic, incorrect beliefs about the bias of known sources. The distortion due to misspecified learning is succinctly captured by a one-dimensional metric we introduce. Our model generates the hostile media effect and false polarization, and it has implications for fact-checking and misperception recalibration.
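As a toy illustration of the yardstick mechanism (all assumptions and parameter values ours, not the paper's), suppose each report equals an unobserved state plus a source-specific bias, and the DM debiases a known source using a dogmatic, incorrect belief about that source's bias before estimating the unfamiliar source's bias.

```python
import numpy as np

# Toy yardstick logic under misspecification (assumptions ours).
# State x_t ~ N(0,1); a known source reports x_t + b_known, an unfamiliar
# source reports x_t + b_new. The DM never sees x_t: she subtracts her
# DOGMATIC belief b_hat from the known report and treats that as the truth.
rng = np.random.default_rng(2)
b_known, b_new = 1.0, 0.5     # true biases (assumed values)
b_hat = 0.2                    # DM's dogmatic, incorrect belief about b_known

T = 10_000
x = rng.normal(size=T)
report_known = x + b_known + rng.normal(scale=0.3, size=T)  # noisy reports
report_new = x + b_new + rng.normal(scale=0.3, size=T)

truth_proxy = report_known - b_hat            # misspecified "truth" yardstick
b_new_estimate = np.mean(report_new - truth_proxy)

# The long-run estimate is shifted by exactly (b_known - b_hat):
print("estimate:", b_new_estimate, "truth:", b_new,
      "gap b_known - b_hat:", b_known - b_hat)
```

In this toy version, the gap between the known source's true bias and the DM's dogmatic belief about it plays the role of a one-dimensional distortion, in the spirit of the metric the abstract describes.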