Department seminar by Lu Yang, University of Amsterdam
Diagnostics for Regression Models with Discrete Outcomes
Making informed decisions about model adequacy has been an outstanding issue for regression models with discrete outcomes. Standard residuals such as Pearson and deviance residuals for such outcomes often show a large discrepancy from the hypothesized pattern even under the true model and are not informative especially when data are highly discrete. To fill this gap, we propose a surrogate empirical residual distribution function for general discrete (e.g. ordinal and count) outcomes that serves as an alternative to the empirical Cox-Snell residual distribution function. When at least one continuous covariate is available, we show asymptotically that the proposed function converges uniformly to the identity function under the correctly specified model, even with highly discrete (e.g. binary) outcomes. Through simulation studies, we demonstrate empirically that the proposed surrogate empirical residual distribution function is highly effective for various diagnostic tasks, since it is close to the hypothesized pattern under the true model and significantly departs from this pattern under model misspecification.