<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>13</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Seamus Somerstep</style></author><author><style face="normal" font="default" size="100%">Felipe Maia Polo</style></author><author><style face="normal" font="default" size="100%">Allysson Flavio Melo de Oliveira</style></author><author><style face="normal" font="default" size="100%">Prattyush Mangal</style></author><author><style face="normal" font="default" size="100%">Mírian Silva</style></author><author><style face="normal" font="default" size="100%">Onkar Bhardwaj</style></author><author><style face="normal" font="default" size="100%">Mikhail Yurochkin</style></author><author><style face="normal" font="default" size="100%">Subha Maity</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">CARROT: A Cost Aware Rate Optimal Router</style></title></titles><dates><year><style  face="normal" font="default" size="100%">2025</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://arxiv.org/abs/2502.03261</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">With the rapid growth in the number of Large Language Models (LLMs), there has been recent interest in LLM routing, or directing queries to the cheapest LLM that can deliver a suitable response. Following this line of work, we introduce CARROT, a Cost AwaRe Rate Optimal rouTer that can select models based on any desired trade-off between performance and cost. Given a query, CARROT selects a model based on estimates of models' cost and performance. 
Its simplicity lends CARROT computational efficiency, while our theoretical analysis demonstrates minimax rate-optimality in its routing performance. Alongside CARROT, we also introduce the Smart Price-aware Routing (SPROUT) dataset to facilitate routing on a wide spectrum of queries with the latest state-of-the-art LLMs. Using SPROUT and prior benchmarks such as Routerbench and open-LLM-leaderboard-v2, we empirically validate CARROT's performance against several alternative routers.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>13</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Subha Maity</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Estimation with missing not at random binary outcomes via exponential tilts</style></title></titles><dates><year><style  face="normal" font="default" size="100%">2025</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://arxiv.org/abs/2502.06046</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">We study the problem of missing not at random (MNAR) datasets with binary outcomes. We propose an exponential tilt-based approach that bypasses any knowledge of 'nonresponse instruments' or 'shadow variables' that are usually required for statistical estimation. We establish a sufficient condition for identifiability of tilt parameters and propose an algorithm to estimate them. Based on these tilt parameter estimates, we propose importance weighted and doubly robust estimators for any mean functions of interest, and validate their performance on a synthetic dataset. 
In an experiment with the Waterbirds dataset, we utilize our tilt framework to perform unsupervised transfer learning, when the responses are missing from a target domain of interest, and achieve a prediction performance that is comparable to a gold standard.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Roy, Saptarshi</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Xue, Songkai</style></author><author><style face="normal" font="default" size="100%">Yurochkin, Mikhail</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">How does overparametrization affect performance on minority groups?</style></title><secondary-title><style face="normal" font="default" size="100%">Transactions on Machine Learning Research</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Computer Science - Machine Learning</style></keyword><keyword><style  face="normal" font="default" size="100%">Mathematics - Statistics Theory</style></keyword><keyword><style  face="normal" font="default" size="100%">Statistics - Statistics Theory</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2025</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://openreview.net/pdf?id=POunezXgvF</style></url></web-urls></urls><number><style face="normal" font="default" size="100%">arXiv:2206.03515</style></number><publisher><style face="normal" font="default" size="100%">arXiv</style></publisher><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style 
face="normal" font="default" size="100%">The benefits of overparameterization for the overall performance of modern machine learning (ML) models are well known. However, the effect of overparameterization at a more granular level of data subgroups is less understood. Recent empirical studies demonstrate encouraging results: (i) when groups are not known, overparameterized models trained with empirical risk minimization (ERM) perform better on minority groups; (ii) when groups are known, ERM on data subsampled to equalize group sizes yields state-of-the-art worst-group accuracy in the overparameterized regime. In this paper, we complement these empirical studies with a theoretical investigation of the risk of overparameterized random feature regression models on minority groups with identical feature distribution as the majority group. In a setting in which the regression functions for the majority and minority groups are different, we show that overparameterization either improves or does not harm the asymptotic minority group performance under the ERM setting when the features are distributed uniformly over the sphere.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>10</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Bracale, Daniele</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Banerjee, Moulinath</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Learning the Distribution Map in Reverse Causal Performative Prediction</style></title><secondary-title><style face="normal" font="default" size="100%">International Conference on Artificial Intelligence and 
Statistics</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Computer Science - Machine Learning</style></keyword><keyword><style  face="normal" font="default" size="100%">Statistics - Machine Learning</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2025</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://arxiv.org/abs/2405.15172</style></url></web-urls></urls><number><style face="normal" font="default" size="100%">arXiv:2405.15172</style></number><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">In numerous predictive scenarios, the predictive model affects the sampling distribution; for example, job applicants often meticulously craft their resumes to navigate through screening systems. Such shifts in distribution are particularly prevalent in the realm of social computing, yet the strategies to learn these shifts from data remain remarkably limited. Inspired by a microeconomic model that adeptly characterizes agents' behavior within labor markets, we introduce a novel approach to learn the distribution shift. Our method is predicated on a reverse causal model, wherein the predictive model instigates a distribution shift exclusively through a finite set of agents' actions. 
Within this framework, we employ a microfoundation model for the agents' actions and develop a statistically justified methodology to learn the distribution shift map, which we demonstrate to be effective in minimizing the performative prediction risk.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Seamus Somerstep</style></author><author><style face="normal" font="default" size="100%">Ya'acov Ritov</style></author><author><style face="normal" font="default" size="100%">Mikhail Yurochkin</style></author><author><style face="normal" font="default" size="100%">Subha Maity</style></author><author><style face="normal" font="default" size="100%">Yuekai Sun</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Limitations of refinement methods for weak to strong generalization</style></title><secondary-title><style face="normal" font="default" size="100%">Conference on Language Modeling (COLM)</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2025</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://openreview.net/pdf?id=OKvSnV5Ar7</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>10</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Bracale, Daniele</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Polo, Felipe Maia</style></author><author><style face="normal" font="default" size="100%">Somerstep, Seamus</style></author><author><style face="normal" font="default" size="100%">Banerjee, 
Moulinath</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Microfoundation Inference for Strategic Prediction</style></title><secondary-title><style face="normal" font="default" size="100%">International Conference on Artificial Intelligence and Statistics</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Computer Science - Machine Learning</style></keyword><keyword><style  face="normal" font="default" size="100%">Statistics - Machine Learning</style></keyword><keyword><style  face="normal" font="default" size="100%">Statistics - Methodology</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2025</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://proceedings.mlr.press/v258/bracale25a.html</style></url></web-urls></urls><number><style face="normal" font="default" size="100%">arXiv:2411.08998</style></number><publisher><style face="normal" font="default" size="100%">arXiv</style></publisher><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Often in prediction tasks, the predictive model itself can influence the distribution of the target variable, a phenomenon termed performative prediction. Generally, this influence stems from strategic actions taken by stakeholders with a vested interest in predictive models. A key challenge that hinders the widespread adoption of performative prediction in machine learning is that practitioners are generally unaware of the social impacts of their predictions. To address this gap, we propose a methodology for learning the distribution map that encapsulates the long-term impacts of predictive models on the population. 
Specifically, we model agents' responses as a cost-adjusted utility maximization problem and propose estimates for said cost. Our approach leverages optimal transport to align pre-model exposure (ex ante) and post-model exposure (ex post) distributions. We provide a rate of convergence for this proposed estimate and assess its quality through empirical demonstrations on a credit-scoring dataset.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Manli Cheng</style></author><author><style face="normal" font="default" size="100%">Subha Maity</style></author><author><style face="normal" font="default" size="100%">Qinglong Tian</style></author><author><style face="normal" font="default" size="100%">Pengfei Li</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Transfer Learning under Group-Label Shift: A Semiparametric Exponential Tilting Approach</style></title></titles><dates><year><style  face="normal" font="default" size="100%">2025</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://arxiv.org/abs/2509.22268</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Ngweta, Lilian</style></author><author><style face="normal" font="default" size="100%">Agarwal, Mayank</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Gittens, Alex</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author><author><style face="normal" font="default" 
size="100%">Yurochkin, Mikhail</style></author></authors><secondary-authors><author><style face="normal" font="default" size="100%">Al-Onaizan, Yaser</style></author><author><style face="normal" font="default" size="100%">Bansal, Mohit</style></author><author><style face="normal" font="default" size="100%">Chen, Yun-Nung</style></author></secondary-authors></contributors><titles><title><style face="normal" font="default" size="100%">Aligners: Decoupling LLMs and Alignment</style></title><secondary-title><style face="normal" font="default" size="100%">Findings of the Association for Computational Linguistics: EMNLP 2024</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2024</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://aclanthology.org/2024.findings-emnlp.808</style></url></web-urls></urls><publisher><style face="normal" font="default" size="100%">Association for Computational Linguistics</style></publisher><pages><style face="normal" font="default" size="100%">13785–13802</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications. Alignment is challenging, costly, and needs to be repeated for every LLM and alignment criterion. We propose to decouple LLMs and alignment by training *aligner* models that can be used to align any LLM for a given criterion on an as-needed basis, thus also reducing the potential negative impacts of alignment on performance. Our recipe for training the aligner models solely relies on synthetic data generated with a (prompted) LLM and can be easily adjusted for a variety of alignment criteria. 
We use the same synthetic data to train *inspectors*, binary misalignment classification models to guide a *squad* of multiple aligners. Our empirical results demonstrate consistent improvements when applying an aligner squad to various LLMs, including chat-aligned models, across several instruction-following and red-teaming datasets.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Agarwal, Mayank</style></author><author><style face="normal" font="default" size="100%">Yurochkin, Mikhail</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">An Investigation of Representation and Allocation Harms in Contrastive Learning</style></title><secondary-title><style face="normal" font="default" size="100%">The Twelfth International Conference on Learning Representations</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2024</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://openreview.net/forum?id=q4SiDyYQbo</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">The effect of underrepresentation on the performance of minority groups is known to be a serious problem in supervised learning settings; however, it has been underexplored so far in the context of self-supervised learning (SSL). In this paper, we demonstrate that contrastive learning (CL), a popular variant of SSL, tends to collapse representations of minority groups with certain majority groups. 
We refer to this phenomenon as representation harm and demonstrate it on image and text datasets using the corresponding popular CL methods. Furthermore, our causal mediation analysis of allocation harm on a downstream classification task reveals that representation harm is partly responsible for it, thus emphasizing the importance of studying and mitigating representation harm. Finally, we provide a theoretical explanation for representation harm using a stochastic block model that leads to a representational neural collapse in a contrastive learning setting.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Dutta, Diptavo</style></author><author><style face="normal" font="default" size="100%">Terhorst, Jonathan</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author><author><style face="normal" font="default" size="100%">Banerjee, Moulinath</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">A linear adjustment-based approach to posterior drift in transfer learning</style></title><secondary-title><style face="normal" font="default" size="100%">Biometrika</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2024</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://doi.org/10.1093/biomet/asad029</style></url></web-urls></urls><number><style face="normal" font="default" size="100%">1</style></number><volume><style face="normal" font="default" size="100%">111</style></volume><pages><style face="normal" font="default" size="100%">31–50</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style 
face="normal" font="default" size="100%">We present new models and methods for the posterior drift problem where the regression function in the target domain is modelled as a linear adjustment, on an appropriate scale, of that in the source domain, and study the theoretical properties of our proposed estimators in the binary classification problem. The core idea of our model inherits the simplicity and the usefulness of generalized linear models and accelerated failure time models from the classical statistics literature. Our approach is shown to be flexible and applicable in a variety of statistical settings, and can be adopted for transfer learning problems in various domains including epidemiology, genetics and biomedicine. As concrete applications, we illustrate the power of our approach (i) through mortality prediction for British Asians by borrowing strength from similar data from the larger pool of British Caucasians, using the UK Biobank data, and (ii) in overcoming a spurious correlation present in the source domain of the Waterbirds dataset.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Polo, Felipe Maia</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Yurochkin, Mikhail</style></author><author><style face="normal" font="default" size="100%">Banerjee, Moulinath</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Weak Supervision Performance Evaluation via Partial Identification</style></title><secondary-title><style face="normal" font="default" size="100%">The Thirty-eighth Annual Conference on Neural Information Processing 
Systems</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2024</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://openreview.net/forum?id=VOVyeOzZx0</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Programmatic Weak Supervision (PWS) enables supervised model training without direct access to ground truth labels, utilizing weak labels from heuristics, crowdsourcing, or pre-trained models. However, the absence of ground truth complicates model evaluation, as traditional metrics such as accuracy, precision, and recall cannot be directly calculated. In this work, we present a novel method to address this challenge by framing model evaluation as a partial identification problem and estimating performance bounds using Fréchet bounds. Our approach derives reliable bounds on key metrics without requiring labeled data, overcoming core limitations in current weak supervision evaluation techniques. Through scalable convex optimization, we obtain accurate and computationally efficient bounds for metrics including accuracy, precision, recall, and F1-score, even in high-dimensional settings. 
This framework offers a robust approach to assessing model quality without ground truth labels, enhancing the practicality of weakly supervised learning for real-world applications.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>10</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Bakshi, Soham</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Bayes classifier cannot be learned from noisy responses with unknown noise rates</style></title><secondary-title><style face="normal" font="default" size="100%">The Eleventh International Conference on Learning Representations, Tiny paper track</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2023</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://openreview.net/forum?id=U4o5iSWSaD</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Training a classifier with noisy labels typically requires the learner to specify the distribution of label noise, which is often unknown in practice. Although there have been some recent attempts to relax that requirement, we show that the Bayes decision rule is unidentified in most classification problems with noisy labels. This suggests it is generally not possible to bypass/relax the requirement. 
In the special cases in which the Bayes decision rule is identified, we develop a simple algorithm to learn the Bayes decision rule that does not require knowledge of the noise distribution.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Mukherjee, Debarghya</style></author><author><style face="normal" font="default" size="100%">Banerjee, Moulinath</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Predictor-corrector algorithms for stochastic optimization under gradual distribution shift</style></title><secondary-title><style face="normal" font="default" size="100%">The Eleventh International Conference on Learning Representations</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2023</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://openreview.net/forum?id=2SV2dlfBuE3</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Time-varying stochastic optimization problems frequently arise in machine learning practice (e.g., gradual domain shift, object tracking, strategic classification). Often, the underlying process that drives the distribution shift is continuous in nature. We exploit this underlying continuity by developing predictor-corrector algorithms for time-varying stochastic optimization that anticipate changes in the underlying data generating process through a predictor-corrector term in the update rule. 
The key challenge is the estimation of the predictor-corrector term; a naive approach based on sample-average approximation may lead to non-convergence. We develop a general moving-average-based method to estimate the predictor-corrector term and provide error bounds for the iterates, both in the presence of pure and noisy access to the queries from the relevant derivatives of the loss function. Furthermore, we show (theoretically and empirically in several examples) that our method outperforms non-predictor-corrector methods that do not anticipate changes in the data generating process.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Ngweta, Lilian</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Gittens, Alex</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author><author><style face="normal" font="default" size="100%">Yurochkin, Mikhail</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Simple Disentanglement of Style and Content in Visual Representations</style></title><secondary-title><style face="normal" font="default" size="100%">Proceedings of the 40th International Conference on Machine Learning</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2023</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://proceedings.mlr.press/v202/ngweta23a.html</style></url></web-urls></urls><publisher><style face="normal" font="default" size="100%">PMLR</style></publisher><volume><style face="normal" font="default" size="100%">202</style></volume><pages><style face="normal" font="default" 
size="100%">26063–26086</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model the pre-trained features probabilistically as linearly entangled combinations of the latent content and style factors and develop a simple disentanglement algorithm based on the probabilistic model. We show that the method provably disentangles content and style features and verify its efficacy empirically. Our post-processed features yield significant domain generalization performance improvements when the distribution shift occurs due to style changes or style-related spurious correlations.</style></abstract><notes><style face="normal" font="default" size="100%">ISSN: 2640-3498</style></notes></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Yurochkin, Mikhail</style></author><author><style face="normal" font="default" size="100%">Banerjee, Moulinath</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Understanding new tasks through the lens of training data via exponential tilting</style></title><secondary-title><style face="normal" font="default" size="100%">The Eleventh International Conference on Learning 
Representations</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2023</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://openreview.net/forum?id=DBMttEEoLbw</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Deploying machine learning models on new tasks is a major challenge due to differences in distributions of the train (source) data and the new (target) data. However, the training data likely captures some of the properties of the new task. We consider the problem of reweighing the training samples to gain insights into the distribution of the target task. Specifically, we formulate a distribution shift model based on the exponential tilt assumption and learn train data importance weights minimizing the KL divergence between labeled train and unlabeled target datasets. The learned train data weights can then be used for downstream tasks such as target performance evaluation, fine-tuning, and model selection. 
We demonstrate the efficacy of our method on Waterbirds and Breeds benchmarks.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author><author><style face="normal" font="default" size="100%">Banerjee, Moulinath</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions</style></title><secondary-title><style face="normal" font="default" size="100%">Journal of Machine Learning Research</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2022</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://jmlr.org/papers/v23/21-0739.html</style></url></web-urls></urls><number><style face="normal" font="default" size="100%">198</style></number><volume><style face="normal" font="default" size="100%">23</style></volume><pages><style face="normal" font="default" size="100%">1–50</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">We consider the task of meta-analysis in high-dimensional settings in which the data sources are similar but non-identical. To borrow strength across such heterogeneous datasets, we introduce a global parameter that emphasizes interpretability and statistical efficiency in the presence of heterogeneity. We also propose a one-shot estimator of the global parameter that preserves the anonymity of the data sources and converges at a rate that depends on the size of the combined dataset. 
For high-dimensional linear model settings, we demonstrate the superiority of our identification restrictions in adapting to a previously seen data distribution as well as predicting for a new/unseen data distribution. Finally, we demonstrate the benefits of our approach on a large-scale drug treatment dataset involving several different cancer cell-lines.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author><author><style face="normal" font="default" size="100%">Banerjee, Moulinath</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Minimax optimal approaches to the label shift problem in non-parametric settings</style></title><secondary-title><style face="normal" font="default" size="100%">Journal of Machine Learning Research</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2022</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://jmlr.org/papers/v23/21-1519.html</style></url></web-urls></urls><number><style face="normal" font="default" size="100%">346</style></number><volume><style face="normal" font="default" size="100%">23</style></volume><pages><style face="normal" font="default" size="100%">1–45</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">We study the minimax rates of the label shift problem in non-parametric classification. 
In addition to the unsupervised setting in which the learner only has access to unlabeled examples from the target domain, we also consider the setting in which a small number of labeled examples from the target domain is available to the learner. Our study reveals a difference in the difficulty of the label shift problem in the two settings, and we attribute this difference to the availability of data from the target domain to estimate the class conditional distributions in the latter setting. We also show that a class proportion estimation approach is minimax rate-optimal in the unsupervised setting.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Kwon, Bum Chul</style></author><author><style face="normal" font="default" size="100%">Kartoun, Uri</style></author><author><style face="normal" font="default" size="100%">Khurshid, Shaan</style></author><author><style face="normal" font="default" size="100%">Yurochkin, Mikhail</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Brockman, Deanna G</style></author><author><style face="normal" font="default" size="100%">Khera, Amit V</style></author><author><style face="normal" font="default" size="100%">Ellinor, Patrick T</style></author><author><style face="normal" font="default" size="100%">Lubitz, Steven A</style></author><author><style face="normal" font="default" size="100%">Ng, Kenney</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">RMExplorer: A Visual Analytics Approach to Explore the Performance and the Fairness of Disease Risk Models on Population Subgroups</style></title><secondary-title><style face="normal" font="default" size="100%">IEEE Visualization and Visual Analytics 
(VIS)</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Analytical models</style></keyword><keyword><style  face="normal" font="default" size="100%">Atrial Fibrillation</style></keyword><keyword><style  face="normal" font="default" size="100%">Biological system modeling</style></keyword><keyword><style  face="normal" font="default" size="100%">Computational modeling</style></keyword><keyword><style  face="normal" font="default" size="100%">Data visualization</style></keyword><keyword><style  face="normal" font="default" size="100%">electronic health records</style></keyword><keyword><style  face="normal" font="default" size="100%">explainability</style></keyword><keyword><style  face="normal" font="default" size="100%">fairness</style></keyword><keyword><style  face="normal" font="default" size="100%">health informatics</style></keyword><keyword><style  face="normal" font="default" size="100%">Human-centered computing</style></keyword><keyword><style  face="normal" font="default" size="100%">interpretability</style></keyword><keyword><style  face="normal" font="default" size="100%">sociology</style></keyword><keyword><style  face="normal" font="default" size="100%">subgroup analysis</style></keyword><keyword><style  face="normal" font="default" size="100%">visual analytics</style></keyword><keyword><style  face="normal" font="default" size="100%">Visualization</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2022</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://ieeexplore.ieee.org/abstract/document/9973226</style></url></web-urls></urls><pages><style face="normal" font="default" size="100%">50–54</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Disease risk models can identify high-risk patients and help clinicians provide 
more personalized care. However, risk models developed on one dataset may not generalize across diverse subpopulations of patients in different datasets and may have unexpected performance. It is challenging for clinical researchers to inspect risk models across different subgroups without any tools. Therefore, we developed an interactive visualization system called RMExplorer (Risk Model Explorer) to enable interactive risk model assessment. Specifically, the system allows users to define subgroups of patients by selecting clinical, demographic, or other characteristics, to explore the performance and fairness of risk models on the subgroups, and to understand the feature contributions to risk scores. To demonstrate the usefulness of the tool, we conduct a case study, where we use RMExplorer to explore three atrial fibrillation risk models by applying them to the UK Biobank dataset of 445,329 individuals. RMExplorer can help researchers to evaluate the performance and biases of risk models on subpopulations of interest in their data.</style></abstract><notes><style face="normal" font="default" size="100%">ISSN: 2771-9553</style></notes></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Bhattacharyya, Rupam</style></author><author><style face="normal" font="default" size="100%">Burman, Anik</style></author><author><style face="normal" font="default" size="100%">Singh, Kalpana</style></author><author><style face="normal" font="default" size="100%">Banerjee, Sayantan</style></author><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Auddy, Arnab</style></author><author><style face="normal" font="default" size="100%">Rout, Sarit Kumar</style></author><author><style face="normal" font="default" size="100%">Lahoti, 
Supriya</style></author><author><style face="normal" font="default" size="100%">Panda, Rajmohan</style></author><author><style face="normal" font="default" size="100%">Baladandayuthapani, Veerabhadran</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Role of multiresolution vulnerability indices in COVID-19 spread in India: a Bayesian model-based analysis</style></title><secondary-title><style face="normal" font="default" size="100%"> BMJ Open</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">covid-19</style></keyword><keyword><style  face="normal" font="default" size="100%">epidemiology</style></keyword><keyword><style  face="normal" font="default" size="100%">Public health</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2022</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://bmjopen.bmj.com/content/12/11/e056292</style></url></web-urls></urls><number><style face="normal" font="default" size="100%">11</style></number><volume><style face="normal" font="default" size="100%">12</style></volume><pages><style face="normal" font="default" size="100%">e056292</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Objectives COVID-19 has differentially affected countries, with health infrastructure and other related vulnerability indicators playing a role in determining the extent of its spread. Vulnerability of a geographical region to COVID-19 has been a topic of interest, particularly in low-income and middle-income countries like India to assess its multifactorial impact on incidence, prevalence or mortality. This study aims to construct a statistical analysis pipeline to compute such vulnerability indices and investigate their association with metrics of the pandemic growth. 
Design Using publicly reported observational socioeconomic, demographic, health-based and epidemiological data from Indian national surveys, we compute contextual COVID-19 Vulnerability Indices (cVIs) across multiple thematic resolutions for different geographical and spatial administrative regions. These cVIs are then used in Bayesian regression models to assess their impact on indicators of the spread of COVID-19. Setting This study uses district-level indicators and case counts data for the state of Odisha, India. Primary outcome measure We use instantaneous R (temporal average of estimated time-varying reproduction number for COVID-19) as the primary outcome variable in our models. Results Our observational study, focussing on 30 districts of Odisha, identified housing and hygiene conditions, COVID-19 preparedness and epidemiological factors as important indicators associated with COVID-19 vulnerability. Conclusion Having succeeded in containing COVID-19 to a reasonable level during the first wave, the second wave of COVID-19 made greater inroads into the hinterlands and peripheral districts of Odisha, burdening the already deficient public health system in these areas, as identified by the cVIs. 
Improved understanding of the factors driving COVID-19 vulnerability will help policy makers prioritise resources and regions, leading to more effective mitigation strategies for the present and future.</style></abstract><notes><style face="normal" font="default" size="100%">Publisher: British Medical Journal Publishing Group Section: Public health</style></notes></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Mukherjee, Debarghya</style></author><author><style face="normal" font="default" size="100%">Yurochkin, Mikhail</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Does enforcing fairness mitigate biases caused by subpopulation shift?</style></title><secondary-title><style face="normal" font="default" size="100%">Advances in Neural Information Processing Systems</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2021</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://proceedings.neurips.cc/paper/2021/hash/d800149d2f947ad4d64f34668f8b20f6-Abstract.html</style></url></web-urls></urls><publisher><style face="normal" font="default" size="100%">Curran Associates, Inc.</style></publisher><volume><style face="normal" font="default" size="100%">34</style></volume><pages><style face="normal" font="default" size="100%">25773–25784</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Many instances of algorithmic bias are caused by subpopulation shifts. 
For example, ML models often perform worse on demographic groups that are underrepresented in the training data. In this paper, we study whether enforcing algorithmic fairness during training improves the performance of the trained model in the target domain. On one hand, we conceive scenarios in which enforcing fairness does not improve performance in the target domain. In fact, it may even harm performance. On the other hand, we derive necessary and sufficient conditions under which enforcing algorithmic fairness leads to the Bayes model in the target domain. We also illustrate the practical implications of our theoretical results in simulations and on real data.</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Maity, Subha</style></author><author><style face="normal" font="default" size="100%">Xue, Songkai</style></author><author><style face="normal" font="default" size="100%">Yurochkin, Mikhail</style></author><author><style face="normal" font="default" size="100%">Sun, Yuekai</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Statistical inference for individual fairness</style></title><secondary-title><style face="normal" font="default" size="100%">The Ninth International Conference on Learning Representations</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2021</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://openreview.net/forum?id=z9k8BWL-_2u</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">As we rely on machine learning (ML) models to make more consequential decisions, the issue of ML models perpetuating unwanted 
social biases has come to the fore of the public's and the research community's attention. In this paper, we focus on the problem of detecting violations of individual fairness in ML models. We formalize the problem as measuring the susceptibility of ML models against a form of adversarial attack and develop a suite of inference tools for the adversarial loss. The tools allow practitioners to assess the individual fairness of ML models in a statistically-principled way: form confidence intervals for the adversarial loss and test hypotheses of model fairness with (asymptotic) non-coverage/Type I error rate control. We demonstrate the utility of our tools in a real-world case study.</style></abstract></record></records></xml>