Seminar

Wednesday, October 7, 2020 11:00 am - 11:00 am EDT (GMT -04:00)

Department Seminar by Pieter Allaart

Please Note: This seminar will be given online.

Probability Seminar Series

Pieter Allaart, Professor
University of North Texas

Link to join seminar: Hosted on Webex.

On univoque and strongly univoque sets

Thursday, September 17, 2020 4:00 pm - 4:00 pm EDT (GMT -04:00)

Department seminar by Neil Spencer, Carnegie Mellon University

A new framework for modeling sparse networks that makes sense (and can actually be fit!)

Latent position models are a versatile tool when working with network data. Applications include clustering entities, network visualization, and controlling for unobserved causal confounding. In traditional treatments of the latent position model, the nodes’ latent positions are viewed as independent and identically distributed random variables. This assumption implies that the average node degree grows linearly with the number of nodes in the network, making it inappropriate when the network is sparse. In the first part of this talk, I will propose an alternative assumption—that the latent positions are generated according to a Poisson point process—and show that it is compatible with various levels of network sparsity. I will also provide theory establishing that the nodes’ latent positions can be consistently estimated, provided that the network isn't too sparse.  In the second part of the talk, I will consider the computational challenge of fitting latent position models to large datasets. I will describe a new Markov chain Monte Carlo strategy—based on a combination of split Hamiltonian Monte Carlo and Firefly Monte Carlo—that is much more efficient than the standard Metropolis-within-Gibbs algorithm for inferring the latent positions. Throughout the talk, I will use an advice-sharing network of elementary school teachers within a school district as a running example.

Please note: This talk will be hosted on Webex. To join please click on the following link: Department seminar by Neil Spencer.

Thursday, September 10, 2020 4:00 pm - 4:00 pm EDT (GMT -04:00)

Department seminar by Emma Jingfei Zhang, Miami University

Network Response Regression for Modeling Population of Networks with Covariates


Multiple-network data have been fast emerging in recent years, where a separate network over a common set of nodes is measured for each individual subject, along with rich subject covariate information. Existing network analysis methods have primarily focused on modeling a single network and are not directly applicable to multiple networks with subject covariates.

In this talk, we present a new network response regression model, where the observed networks are treated as matrix-valued responses, and the individual covariates as predictors. The new model characterizes the population-level connectivity pattern through a low-rank intercept matrix, and the parsimonious effects of subject covariates on the network through a sparse slope tensor. We formulate the parameter estimation as a non-convex optimization problem, and develop an efficient alternating gradient descent algorithm. We establish the non-asymptotic error bound for the actual estimator from our optimization algorithm. Built upon this error bound, we derive the strong consistency for network community recovery, as well as the edge selection consistency. We demonstrate the efficacy of our method through intensive simulations and two brain connectivity studies.

Join Zoom Meeting

Meeting ID: 844 283 6948
Passcode: 318995

Wednesday, August 19, 2020 4:00 pm - 5:00 pm EDT (GMT -04:00)

Student Seminar by Samuel Wong, Assistant Professor

Assessing the Impacts of Mutations to the Structure of COVID-19 Spike Protein via Sequential Monte Carlo


Proteins play a key role in facilitating the infectiousness of the 2019 novel coronavirus. A specific spike protein enables this virus to bind to human cells, and a thorough understanding of its 3-dimensional structure is therefore critical for developing effective therapeutic interventions. However, its structure may continue to evolve over time as a result of mutations. We take a data science perspective to study the potential structural impacts due to ongoing mutations in its amino acid sequence. To do so, we identify a key segment of the protein and apply a sequential Monte Carlo sampling method to detect possible changes to the space of low-energy conformations for different amino acid sequences. Such computational approaches can further our understanding of this important protein structure and complement laboratory efforts.

Please Note: This talk will be given through Microsoft Teams. To join please click here: Student Seminar by Samuel Wong

Thursday, August 13, 2020 4:00 pm - 4:00 pm EDT (GMT -04:00)

Department Seminar by Aaditya Ramdas, Carnegie Mellon University

Concentration inequalities for sampling without replacement, with applications to post-election audits


Many practical tasks involve sampling sequentially without replacement from a finite population in order to estimate some parameter, like a mean. We discuss how to derive powerful (new) concentration inequalities for this setting using martingale techniques, and apply them to auditing elections (see below).
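For readers unfamiliar with the setting, the following is a minimal sketch of a classical fixed-sample-size bound for sampling without replacement (the Hoeffding-Serfling inequality). It is a baseline for illustration only, not the new inequalities from the talk, and unlike a confidence sequence it is not valid under continuous monitoring. The function names and the toy population are assumptions made for this example.

```python
import math
import random


def hoeffding_serfling_radius(n, N, alpha, lo=0.0, hi=1.0):
    """One-sided deviation eps with P(sample_mean - mu >= eps) <= alpha.

    Classical Hoeffding-Serfling bound for n draws without replacement
    from a population of size N whose values lie in [lo, hi].
    """
    if not 1 <= n <= N:
        raise ValueError("need 1 <= n <= N")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    # Finite-population factor: shrinks toward 1/N as n approaches N.
    shrink = 1.0 - (n - 1) / N
    return (hi - lo) * math.sqrt(shrink * math.log(1.0 / alpha) / (2.0 * n))


def two_sided_interval(sample, N, alpha, lo=0.0, hi=1.0):
    """Fixed-n confidence interval for the population mean (level 1 - alpha)."""
    n = len(sample)
    mean = sum(sample) / n
    # Split alpha across the two tails.
    eps = hoeffding_serfling_radius(n, N, alpha / 2.0, lo, hi)
    return max(lo, mean - eps), min(hi, mean + eps)


if __name__ == "__main__":
    # Hypothetical toy population: 1000 ballots, 550 of them for candidate A.
    population = [1] * 550 + [0] * 450
    rng = random.Random(0)
    sample = rng.sample(population, 200)  # draws without replacement
    print(two_sided_interval(sample, len(population), alpha=0.05))
```

The finite-population factor is what distinguishes this bound from the ordinary Hoeffding inequality: the interval tightens faster as the sample exhausts the population.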

This is joint work with my PhD student, Ian Waudby-Smith, who was an undergrad at UWaterloo. An early preprint is available here.

More details: When determining the outcome of an election, electronic voting machines are often employed for their tabulation speed and cost-effectiveness. Unlike paper ballots, these machines are vulnerable to software bugs and fraudulent tampering. Post-election audits provide assurance that announced electoral outcomes are consistent with paper ballots or voter-verifiable records. We propose an approach to election auditing based on confidence sequences (VACSINE)—these are visualizable sequences of confidence sets for the total number of votes cast for each candidate that adaptively shrink to zero width. These confidence sequences have uniform coverage from the beginning of an audit to the point of an exhaustive recount, but their main advantage is that their error guarantee is immune to continuous monitoring and early stopping, providing valid inference at any auditor-chosen, data-dependent stopping time. We develop VACSINEs for various types of elections including plurality, approval, ranked-choice, and score voting protocols.

Please Note: This talk will be given through Zoom. To join, please follow this link: Department Seminar by Aaditya Ramdas.

Wednesday, August 12, 2020 4:00 pm - 4:00 pm EDT (GMT -04:00)

Student Seminar by Rui Qiao, PhD in Statistics

A statistician's introduction to proteomics


Proteomics is the large-scale study of proteins. It has important applications in drug discovery and antibody sequencing. In this talk, I would like to explain the basic concepts and data formats in proteomics. I will introduce the commonly used workflows to generate statistically analyzable data from the raw data stored on public repositories. And, I want to share with you several important research topics in proteomics where I think statisticians could make a huge contribution.

Please Note: This talk will be given through Microsoft Teams. To join, please follow this link: Virtual Seminar by Rui Qiao.

Wednesday, August 5, 2020 4:00 pm - 4:00 pm EDT (GMT -04:00)

Student Seminar by Carlos Araiza Iturria, PhD in Actuarial Science

Discrimination-aware decisions in finance and insurance


We discuss the implications of considering protected attributes when individuals are paired with measures of risk. Two examples are analyzed: a credit scoring example using simulated data, given from the perspective of the regulator, and an insurance pricing scenario, analyzed in view of the underlying causal model.

Please Note: This talk will be given online through Microsoft Teams. To join, please follow this link: Virtual Seminar by Carlos Araiza Iturria.

Wednesday, July 29, 2020 4:00 pm - 4:00 pm EDT (GMT -04:00)

Student Seminar by Chris Salahub, PhD in Statistics

A statistician's introduction to genomics


A classical model of genetic association is introduced alongside a short history of its development with a particular focus on mouse models. The inferential consequences of the widespread use of mouse models are discussed, and the modern application of this model is introduced as a problem of measuring pairwise associations in a large data set. A broad algebraic framework for this model and others like it is used to demonstrate several results and suggest future avenues of investigation.

Thursday, July 23, 2020 4:00 pm - 4:00 pm EDT (GMT -04:00)

Department Seminar by Kevin (Haosui) Duanmu, UC Berkeley

Applications of Nonstandard Analysis to Markov Processes


Nonstandard analysis, a powerful machinery derived from mathematical logic, has had many applications in probability theory as well as stochastic processes. Nonstandard analysis allows construction of a single object---a hyperfinite probability space---which satisfies all the first order logical properties of a finite probability space, but which can be simultaneously viewed as a measure-theoretical probability space via the Loeb construction. As a consequence, the hyperfinite/measure duality has proven to be particularly useful in porting discrete results into their continuous settings.

In this talk, for every general-state-space discrete-time Markov process satisfying appropriate conditions, we construct a hyperfinite Markov process which has all the first order logical properties of a finite Markov process to represent it. We show that the mixing time and the hitting time agree with each other up to some multiplicative constants for discrete-time general-state-space reversible Markov processes satisfying certain conditions. Finally, we show that our result is applicable to a large class of Gibbs samplers and Metropolis-Hastings algorithms.

Please note: This seminar will be delivered online through Webex. To join, please follow this link: Virtual seminar by Kevin (Haosui) Duanmu.

Thursday, July 16, 2020 5:00 pm - 5:00 pm EDT (GMT -04:00)

Department seminar by Nan Zou, Macquarie University

Multivariate Extremes: Block-Maxima vs Peak-Over-Threshold


Extreme value theory is concerned with describing the tail behaviour of univariate and multivariate distributions. In the estimation of the dependence structure of the extremes of multiple time series, the block maxima method and the peaks-over-threshold method are frequently applied. In this talk, I will compare these methods and propose some new methodologies. This is joint work with A. Bücher and S. Volgushev.
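As a rough illustration of the two approaches named in the abstract, the sketch below shows only the data-extraction step for a single (univariate) series: block maxima keep one value per block, while peaks-over-threshold keeps every excess above a chosen level. It is not the speaker's methodology and does not address the multivariate dependence structure; the function names, block size, and threshold are assumptions made for this example.

```python
import random


def block_maxima(series, block_size):
    """Return the maximum of each complete block of consecutive observations."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n_blocks = len(series) // block_size  # a trailing partial block is dropped
    return [
        max(series[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


def peaks_over_threshold(series, threshold):
    """Return the excesses (x - threshold) of observations above the threshold."""
    return [x - threshold for x in series if x > threshold]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical data: 1000 draws from a standard exponential distribution.
    data = [rng.expovariate(1.0) for _ in range(1000)]
    maxima = block_maxima(data, block_size=50)
    excesses = peaks_over_threshold(data, threshold=3.0)
    print(len(maxima), "block maxima;", len(excesses), "threshold excesses")
```

In a typical univariate analysis, the block maxima would then be fitted with a generalized extreme value distribution and the excesses with a generalized Pareto distribution.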

Nan is a lecturer in the Department of Mathematics and Statistics at Macquarie University in Sydney, Australia.

Please note: This seminar will be delivered via Zoom. Please check back later for the link. 

*This seminar will start at 5:00 p.m.