Bayesian Approaches to Dynamic Model Selection
In many applications, investigators monitor processes that vary in space and time, with the goal of identifying temporally persistent and spatially localized departures from a baseline or "normal" behavior. In this talk, I will first discuss a principled Bayesian approach for estimating time-varying functional connectivity networks from brain fMRI data. Dynamic functional connectivity, i.e., the study of how interactions among brain regions change dynamically over the course of an fMRI experiment, has recently received wide interest in the neuroimaging literature. Our method utilizes a hidden Markov model for classification of latent neurological states, achieving estimation of the connectivity networks in an integrated framework that borrows strength over the entire time course of the experiment. Furthermore, we assume that the graph structures, which define the connectivity states at each time point, are related within a super-graph, to encourage the selection of the same edges among related graphs. Then, I will propose a Bayesian nonparametric model selection approach with an application to the monitoring of pneumonia and influenza (P&I) mortality, to detect influenza outbreaks in the continental United States. More specifically, we introduce a zero-inflated conditionally identically distributed species sampling prior, which allows borrowing information across time and assigning data to clusters associated with either a null or an alternative process. Spatial dependences are accounted for by means of a Markov random field prior, which allows the selection to be informed by inferences conducted at nearby locations. We show how the proposed modeling framework performs in an application to the P&I mortality data and in a simulation study, and compare it with common threshold methods for detecting outbreaks over time, with more recent Markov switching-based models, and with other Bayesian nonparametric priors that do not take into account spatio-temporal dependence.
Agent-based Asset Pricing, Learning, and Chaos
The Lucas asset pricing model is one of the most studied models in financial economics in the past decade. In our research, we relax the original assumptions in the Lucas model of homogeneous agents and rational expectations. We populate an artificial economy with heterogeneous and boundedly rational agents. By defining a Correct Expectations Equilibrium, agents are able to compute their policy functions and the equilibrium pricing function without perfect information about the market. A natural adaptive learning scheme is given to agents to update their predictions. We examine the convergence of equilibrium with this learning scheme and show that the equilibrium is learnable (convergent) under certain parameter combinations. We also investigate the market dynamics when agents are out of equilibrium, including the cases where prices have excess volatility and the trading volume is high. Numerical simulations show that our system exhibits rich dynamics, including a whole cascade from period-doubling bifurcations to chaos.
We are extending an invitation to a select group of talented undergraduate, graduate and PhD students to participate in the upcoming University of Waterloo Datathon.
Methods for High Dimensional Compositional Data Analysis in Microbiome Studies
Human microbiome studies using high-throughput DNA sequencing generate compositional data, with the absolute abundances of microbes not recoverable from sequence data alone. In compositional data analysis, each sample consists of proportions of various organisms with a unit sum constraint. This simple feature can cause traditional statistical methods, when naively applied, to produce erroneous results and spurious associations. In addition, microbiome sequence data sets are typically high dimensional, with the number of taxa much greater than the number of samples. These important features require further development of methods for analysis of high dimensional compositional data. This talk presents several recent developments in this area, including methods for estimating the compositions based on sparse count data, two-sample tests for compositional vectors, and regression analysis with compositional covariates. Several microbiome studies at the University of Pennsylvania are used to illustrate these methods, and several open questions will be discussed.
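As a minimal sketch of the unit sum constraint described above, the following Python snippet converts raw counts to proportions and applies the centered log-ratio (CLR) transform, a standard device from compositional data analysis. It is not taken from the talk: the function names `closure` and `clr`, the pseudocount value, and the example counts are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def closure(counts: Sequence[float], pseudocount: float = 0.5) -> List[float]:
    """Turn raw counts into proportions that sum to one.

    A pseudocount (an illustrative assumption, not a recommendation) is
    added to each count only when a zero is present, so logs stay defined.
    """
    if any(c == 0 for c in counts):
        counts = [c + pseudocount for c in counts]
    total = sum(counts)
    return [c / total for c in counts]


def clr(composition: Sequence[float]) -> List[float]:
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical read counts for three taxa in two samples that differ only
# in sequencing depth: the underlying composition is the same.
shallow = closure([1, 2, 7])
deep = closure([10, 20, 70])

clr_shallow = clr(shallow)
clr_deep = clr(deep)
```

Because both samples have the same proportions, their CLR coordinates coincide, which reflects the point that sequencing data carry only relative, not absolute, abundance information. The CLR coordinates always sum to zero, so the constraint is moved rather than removed.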
Probabilistic approaches to mine association rules
Mining association rules is an important and widely applied data mining technique for discovering patterns in large datasets. However, the commonly used support-confidence framework has some often-overlooked weaknesses. This talk introduces a simple stochastic model and shows how it can be used in association rule mining. We apply the model to simulate data for analyzing the behavior and shortcomings of confidence and other measures of interestingness (e.g., lift). Based on these findings, we develop a new model-driven approach to mine rules based on the notion of NB-frequent itemsets, and we define a measure of interestingness that controls for spurious rules and has a strong foundation in statistical testing theory.
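For readers unfamiliar with the measures named above, the following Python sketch computes support, confidence, and lift using their standard textbook definitions. It does not implement the stochastic model or the NB-frequent itemsets from the talk; the helper names and the toy transactions are hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               lhs: Iterable[str], rhs: Iterable[str]) -> float:
    """Estimate of P(rhs | lhs): support(lhs and rhs) / support(lhs)."""
    lhs_set, rhs_set = frozenset(lhs), frozenset(rhs)
    lhs_support = support(transactions, lhs_set)
    if lhs_support == 0:
        raise ValueError("left-hand side never occurs in the transactions")
    return support(transactions, lhs_set | rhs_set) / lhs_support


def lift(transactions: List[FrozenSet[str]],
         lhs: Iterable[str], rhs: Iterable[str]) -> float:
    """Confidence divided by support(rhs); 1.0 suggests independence."""
    rhs_support = support(transactions, rhs)
    if rhs_support == 0:
        raise ValueError("right-hand side never occurs in the transactions")
    return confidence(transactions, lhs, rhs) / rhs_support


# Toy market-basket data, invented for illustration.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

conf_bread_milk = confidence(transactions, {"bread"}, {"milk"})
lift_bread_milk = lift(transactions, {"bread"}, {"milk"})
```

In this toy data the rule bread -> milk has a confidence of 2/3, which looks fairly strong, yet its lift is below 1 because milk appears in 4 of 5 transactions anyway. This is one simple instance of how confidence alone can suggest a rule that reflects nothing more than a frequent right-hand side.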