Statistics & Biostatistics seminar series
Guo (Hugo) Yu
Link to join seminar: Hosted on Zoom
Reluctant interaction modeling in generalized linear models
Analyzing contemporary high-dimensional datasets often leads to extremely large-scale interaction modeling problems, where the challenge is to identify important interactions among billions of candidate pairwise interactions. While several methods have recently been proposed to tackle this challenge, most of them (1) focus on linear models with interactions and/or (2) impose a hierarchy assumption. In practice, however, neither of these two conditions need hold. We propose an interaction modeling framework for generalized linear models (GLMs) that is free of any hierarchy assumption. The basic premise is a non-trivial extension to the GLM setting of the reluctant interaction modeling framework for linear models (Yu et al., 2019), in which main effects are preferred over interactions, all else being equal. The proposed method is easy to implement and highly scalable to large datasets. Theoretically, we show that the proposed method recovers all the important interactions with high probability. Its favorable computational and statistical properties are demonstrated through empirical studies.
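To make the "reluctant" premise concrete, here is a minimal sketch for the linear-model case only. It is not the GLM procedure presented in the talk. It assumes ordinary least squares for the main-effects fit and absolute correlation with the residual as the screening score; the function name `reluctant_screen` and the parameter `n_keep` are hypothetical and chosen for illustration.

```python
import numpy as np

def reluctant_screen(X, y, n_keep=10):
    """Illustrative sketch (hypothetical helper, linear model only).

    Main effects get the first chance to explain y; pairwise products
    are ranked only by how well they explain what is left over.
    """
    n, p = X.shape
    # Step 1: fit main effects only (plain least squares, an assumption).
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    # Step 2: form all p(p-1)/2 pairwise products x_j * x_k with j < k.
    rows, cols = np.triu_indices(p, k=1)
    Z = X[:, rows] * X[:, cols]
    # Step 3: score each product by |correlation| with the residual.
    Zc = Z - Z.mean(axis=0)
    rc = resid - resid.mean()
    denom = np.linalg.norm(Zc, axis=0) * np.linalg.norm(rc)
    scores = np.abs(Zc.T @ rc) / np.where(denom > 0, denom, np.inf)
    # Keep the n_keep highest-scoring pairs, best first.
    top = np.argsort(scores)[::-1][:n_keep]
    return [(int(rows[k]), int(cols[k])) for k in top]

# Simulated data: one main effect (column 0) and one interaction (columns 1, 2)
# whose parent main effects are absent, so hierarchy does not hold.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))
y = X[:, 0] + 3.0 * X[:, 1] * X[:, 2] + 0.1 * rng.standard_normal(500)
pairs = reluctant_screen(X, y, n_keep=5)
```

The sketch materializes every pairwise product at once, which is only feasible for small p; with billions of candidate pairs the products would have to be scored in batches.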