Events - December 2019

Thursday, December 19, 2019 — 10:00 AM EST

Uncover Hidden Fine-Grained Scientific Information: Structured Latent Attribute Models


In modern psychological and biomedical research with diagnostic purposes, scientists often formulate the key task as inferring fine-grained latent information under structural constraints. These structural constraints usually come from domain experts’ prior knowledge or insight. The emerging family of Structured Latent Attribute Models (SLAMs) accommodates these modeling needs and has received substantial attention in psychology, education, and epidemiology. SLAMs bring exciting opportunities and unique challenges. In particular, with high-dimensional discrete latent attributes and structural constraints encoded by a design matrix, one needs to balance the gain in the model’s explanatory power and interpretability against the difficulty of understanding and handling the complex model structure.

In the first part of this talk, I present identifiability results that advance the theoretical knowledge of how the design matrix influences the estimability of SLAMs. The new identifiability conditions guide real-world practices of designing diagnostic tests and also lay the foundation for drawing valid statistical conclusions. In the second part, I introduce a statistically consistent penalized likelihood approach to selecting significant latent patterns in the population. I also propose a scalable computational method. These developments explore an exponentially large model space involving many discrete latent variables, and they address the estimation and computation challenges of high-dimensional SLAMs arising from large-scale scientific measurements. Applying the proposed methodology to data from an international educational assessment reveals a meaningful knowledge structure in the student population.

Monday, December 16, 2019 — 10:00 AM EST

The Blessings of Multiple Causes


Causal inference from observational data is a vital problem, but it comes with strong assumptions. Most methods assume that we observe all confounders, variables that affect both the causal variables and the outcome variables. But whether we have observed all confounders is a famously untestable assumption. We describe the deconfounder, a way to do causal inference from observational data allowing for unobserved confounding.

How does the deconfounder work? The deconfounder is designed for problems of multiple causal inference: scientific studies that involve many causes whose effects are simultaneously of interest. The deconfounder uses the correlation among causes as evidence for unobserved confounders, combining unsupervised machine learning and predictive model checking to perform causal inference. We study the theoretical requirements for the deconfounder to provide unbiased causal estimates, along with its limitations and tradeoffs. We demonstrate the deconfounder on real-world data and simulation studies.

Thursday, December 5, 2019 — 4:00 PM EST

Having impact with data science in industry … Data science as an integrated set of skills in data analytics, data engineering and data entrepreneurship


Industry is going through rapid and profound changes, and the possibilities created by data science are one of the phenomena driving them. But data science is more than analytics and machine learning, and students need a T-shaped package of skills to have a successful career.

This talk sketches the role of data science in a rapidly changing industry. We discuss applications of data science in different innovation horizons, from improvement of current processes and product lines to new business models and disruptive innovation driven by data. Having impact with data science in large, complex organizations is a challenge. It requires a blend of skills in analytics and machine learning, knowledge of computer science and IT infrastructures, and expertise in entrepreneurial project management. The talk presents a framework for organizing data science in CRISP-DM projects, along with the various roles of data analysts, data engineers, domain experts, and executives. I also present the teaching philosophy of the Jheronimus Academy of Data Science in the Netherlands, where we design teaching programs around the three pillars of data analytics, engineering, and entrepreneurship, and where programs are delivered in close collaboration with six application domains. Finally, I share some personal observations on the role of statistical thinking in the computer-science-dominated world of data science.
