The Larger, the Better—or the Larger, the Worse? The Hidden Distortion of Measurement Error in Ultrahigh-Dimensional Data

Abstract

Standard statistical methodologies often rely on the tacit assumption that observed data are error-free, yet real-world data frequently suffer from measurement error arising from data collection procedures, reporting inaccuracies, privacy-preserving mechanisms, and intrinsic variability. Measurement error is widely acknowledged but often ignored in statistical analysis, despite its potential to seriously distort results. This talk introduces key concepts of measurement error and explains why common beliefs, such as that its effects are negligible or that it always leads to attenuation, can be misleading. We examine how measurement error distorts statistical inference, particularly in ultrahigh-dimensional settings, where it can induce complex and nonstandard effects. The discussion focuses on a function-on-scalar regression framework and highlights the distinct impacts of different measurement error processes in ultrahigh-dimensional versus finite-dimensional settings.
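
For readers unfamiliar with the attenuation belief mentioned above, the following minimal sketch simulates the textbook case it comes from: simple linear regression with one scalar covariate observed with classical additive error. It is an illustration only and is not the function-on-scalar framework of the talk. All names and parameter values (`ols_slope`, `simulate`, `beta`, `sigma_x`, `sigma_u`, the noise level 0.5) are assumptions chosen for the example. In this simple setting the naive slope shrinks toward zero by roughly the reliability ratio sigma_x^2 / (sigma_x^2 + sigma_u^2); the abstract's point is that this behaviour need not carry over to other error processes or to ultrahigh-dimensional settings.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x (simple linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=2.0, sigma_x=1.0, sigma_u=1.0, seed=0):
    """Return (slope using the true covariate, slope using the noisy one).

    Assumed model (illustrative, not from the talk):
        y = beta * x + eps,   eps ~ N(0, 0.5^2)
        w = x + u,            u ~ N(0, sigma_u^2), independent of x and eps
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = [rng.gauss(0.0, sigma_x) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]
    w = [xi + rng.gauss(0.0, sigma_u) for xi in x]  # error-prone covariate
    return ols_slope(x, y), ols_slope(w, y)


slope_true, slope_naive = simulate()

# Reliability ratio for sigma_x = sigma_u = 1: 1 / (1 + 1) = 0.5,
# so the naive slope should be close to 0.5 * beta = 1.0.
reliability = 1.0**2 / (1.0**2 + 1.0**2)

print(f"slope with true covariate:  {slope_true:.3f}")
print(f"slope with noisy covariate: {slope_naive:.3f}")
print(f"beta * reliability:         {2.0 * reliability:.3f}")
```

With equal signal and error variances, the naive estimate lands near half the true slope, which is the attenuation that common intuition extrapolates to every setting.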