Joel Dubin


Contact Information:
Joel Dubin

Health Data Science Lab (HDSL) Lead:

HDSL Website 

Research interests

My primary research interest is methodological development in longitudinal data analysis, including multivariate longitudinal data, where more than one outcome (e.g., systolic and diastolic blood pressure) is followed for each individual over time. Methods pursued for this type of data include curve-based approaches to estimating the correlation between different longitudinal outcomes over time, including approaches that incorporate lags and derivatives of the curves. I am also interested in change point and latent response models for longitudinal data, as well as prediction models, including the use of similarity to improve prediction accuracy.

I work in a variety of application areas, including nephrology, cancer, smoking cessation, intensive care, electronic health records, nutrition, aging, and environmental issues.


After completing my master's degree in Applied Statistics at Villanova University in 1993, I worked at Veteran Affairs Health Services and Research in Houston, Texas, and at the University of Texas M.D. Anderson Cancer Center, also in Houston.

I then went on to receive my PhD in Statistics from the University of California at Davis in 2000, after which I worked as an assistant professor at the Yale University Division of Biostatistics, now the Department of Biostatistics, forging several collaborations with researchers in public health and medicine.

I arrived at the University of Waterloo as an associate professor in 2005, with a joint appointment in the Department of Statistics and Actuarial Science and the Department of Health Studies and Gerontology, the latter of which is now the School of Public Health Sciences.

Selected publications

  • Yeh C-K, Rice G, Dubin JA.  Functional spherical autocorrelation: a robust estimate of the autocorrelation of a functional time series.  Electronic Journal of Statistics, 17: 650-687, 2023.
  • Battista K, Diao L, Patte KA, Dubin JA, Leatherdale ST. Examining the use of decision trees in population health surveillance research: An application to youth mental health survey data in the COMPASS study. Health Promotion and Chronic Disease Prevention in Canada, 43(2): 73-86, 2023.
  • Yeh C-K, Rice G, Dubin JA.  Evaluating real-time probabilistic forecasts with application to National Basketball Association outcome prediction. The American Statistician, 76(3): 214-223, 2022.

  • Battista K, Patte KA, Diao L, Dubin JA, Leatherdale ST. Using decision trees to examine environmental and behavioural factors associated with youth anxiety, depression, and flourishing. International Journal of Environmental Research and Public Health, 19(17): 10873, 2022. 

  • Feng S, Dubin JA.  Identifying early-measured variables associated with APACHE IVa providing incorrect in-hospital mortality predictions for critical care patients.  Scientific Reports, 11:22203, 2021.

  • Sahu KS, Majowicz S, Dubin JA, Morita PP. NextGen public health surveillance and the Internet of Things (IoT).  Frontiers in Public Health, 9:756675, 2021.

  • Sharafoddini A, Dubin JA, Lee J. Identifying subpopulations of septic patients: A temporal data-driven approach. Computers in Biology and Medicine, 130: 104182, 2021.

  • Pirrie M, Carson V, Dubin JA, Leatherdale ST. School-level factors within Comprehensive School Health associated with the trajectory of moderate-to-vigorous physical activity over time: A longitudinal, multilevel analysis in a large sample of Grade 9 and 10 students in Canada. International Journal of Environmental Research and Public Health, 18(23): 12761, 2021.

  • Ji K, Dubin JA.  A semiparametric stochastic mixed effects model for bivariate cyclic longitudinal data.  Canadian Journal of Statistics, 48(3): 471-498, 2020. 

  • Gohari MR, Cook RJ, Dubin JA, Leatherdale ST. Identifying patterns of alcohol use among secondary school students in Canada: a multilevel latent class analysis. Addictive Behaviors, 100: 106120, 2020.

  • Xu Y, Lee J, Dubin JA. Similarity-based random survival forest. arXiv:1903.01029; Computation (stat.CO), 2019.

  • Hack EE, Dubin JA, Fernandes M, Costa SC, Tyas SL. Multilingualism and dementia risk: Longitudinal analysis of the Nun Study. Journal of Alzheimer’s Disease, 71(1): 201-212, 2019. DOI: 10.3233/JAD-181302

  • Gohari MR, Dubin JA, Cook RJ, Leatherdale ST. Identifying trajectories of alcohol use among a sample of secondary-school students in Ontario and Alberta: longitudinal evidence from the COMPASS study. Health Promotion and Chronic Disease Prevention in Canada: Research, Policy and Practice, 39(8-9): 244-253, 2019. DOI: 10.24095/hpcdp.39.8/9.02

  • Yang Y, Hirdes JP, Dubin JA, Lee J.  Fall risk classification in community-dwelling older adults using a smart wrist-worn device and the Resident Assessment Instrument-Home Care (RAI-HC): Prospective observational study.  Journal of Medical Internet Research (JMIR) - Aging, 2(1), e12153, 2019.

  • Sharafoddini A, Dubin JA, Maslove DM, Lee J.  A new insight into missing data in intensive care unit patient profiles: observational study. JMIR (Journal of Medical Internet Research) – Medical Informatics, 7(1): e11605, 2019.  DOI: 10.2196/11605

  • Waudby-Smith I, Tran N, Dubin JA, Lee J.  Sentiment in nursing notes as an indicator of patient mortality after admission to an intensive care unit.  PLoS ONE, 13(6): e0198687, 2018. 

  • Laxer RE, Cooke M, Dubin JA, Brownson RC, Chaurasia A, Leatherdale ST.  Behavioural patterns only predict concurrent BMI status and not BMI trajectories in a sample of youth in Ontario, Canada.  PLoS ONE, 13(1): e0190405, 2018. 

  • Sharafoddini A, Dubin JA, Lee J.  Patient similarity in prediction models based on health data: a scoping review. JMIR - Medical Informatics, 5(1): e7, 2017.

  • Laxer RE, Brownson RC, Dubin JA, Cooke M, Chaurasia A, Leatherdale ST.  Clustering of risk-related modifiable behaviours and their association with overweight and obesity among a large sample of youth in the COMPASS study.  BMC Public Health, 17(102), 2017. 

  • Maslove DM, Dubin JA, Shrivats A, Lee J.  Errors, omissions, and outliers in hourly vital signs measurements in intensive care.  Critical Care Medicine, 44(11): e1021-e1030, 2016.

  • Raffa JD, Dubin JA.  Multivariate longitudinal data analysis with mixed effects hidden Markov models. Biometrics, 71(3): 821-831, 2015.
  • Lee J, Maslove DM, Dubin JA.  Personalized mortality prediction driven by electronic medical data and a patient similarity metric. PLOS ONE, 10(5): e0127428, 2015.
  • Leatherdale ST, Brown KS, Carson V, Childs RA, Dubin JA, Elliot SJ, Faulkner G, Hammond D, Manske S, Sabiston CM, Laxer RE, Bredin C, Thompson-Haile A.  The COMPASS study: a longitudinal hierarchical research platform for evaluating natural experiments related to changes in school-level programs, policies and built environment resources. BMC Public Health, 14: 331, 2014.
  • Khan, S.A., Chiu, G.S., Dubin, J.A. Therapeutic hypothermia: quantification of the transition of core body temperature using the flexible mixture bent-cable model for longitudinal data. Australian & New Zealand Journal of Statistics, 55: 369-385, 2013 (published in 2014).
  • Foebel A.D., Heckman G.A., Ji K., Dubin J.A., Turpie I.D., Hussack P., McKelvie R.S. Heart failure-related mortality and hospitalization in the year following admission to a long-term care facility: The Geriatric Outcomes and Longitudinal Decline in Heart Failure (GOLD-HF) Study. Journal of Cardiac Failure, 19(7): 468-477, 2013.
  • Khan, S.A., Rana, M., Li, L., Dubin, J.A. A statistical investigation to monitor and understand atmospheric CFC decline with the spatial-longitudinal bent-cable model. International Journal of Statistics & Probability, 1(2): 56-68, 2012.
  • Xiong, X, Dubin, J.A.  A binning method for analyzing mixed longitudinal data measured at distinct time points. Statistics in Medicine, 29: 1919-1931, 2010.
  • Dubin J.A., O'Malley S.S.  Event charts for the analysis of adverse events in longitudinal studies: an example from a smoking cessation pharmacotherapy trial.  The Open Epidemiology Journal, 3: 34-41, 2010.
  • Khan, S.A., Chiu, G., Dubin, J.A.  Atmospheric concentration of chlorofluorocarbons: addressing the global concern with the longitudinal bent-cable model.  CHANCE, 22: 8-17, 2009.
  • Hall P.A., Dubin J.A., Crossley M., Holmqvist M., D'Arcy, C.   Does executive function explain the IQ-mortality association? Evidence from the Canadian Study on Health and Aging.  Psychosomatic Medicine, 71: 196-204, 2009.
  • Wainwright, P.E., Leatherdale, S.T., Dubin, J.A.  Advantages of mixed effects models over traditional ANOVA models in developmental studies: A worked example in a mouse model of fetal alcohol syndrome.  Developmental Psychobiology, 49: 664-674, 2007.
  • Dubin J.A., Han, L., Fried, T.R.  Triggered sampling could help improve longitudinal studies of persons with elevated mortality risk.  Journal of Clinical Epidemiology, 60: 288-293, 2007.
  • Dubin J.A., Müller H.G.  Dynamical correlation for multivariate longitudinal data.  Journal of the American Statistical Association, 100: 872-881, 2005.
  • Hardy, S.E., Dubin, J.A., Holford, T.R., Gill, T.M.  Transitions between states of disability and independence among older persons.  American Journal of Epidemiology, 161: 575-584, 2005.
  • Dubin J.A., Müller H.G., Wang J.L.  Event history graphs for censored survival data.  Statistics in Medicine, 20: 2951-2964, 2001.
  • Lee JJ, Hess KH, Dubin JA.  Extensions and applications of event charts.  American Statistician, 54(1): 63-70, 2000.

Prof. Dubin's Google Scholar profile

Software development

  • Dubin JA, Li M, Qiao D, Müller HG.  R package dynCorr to perform dynamical correlation analysis on multivariate longitudinal data, as described in Dubin and Müller (2005). Available on CRAN. The most recent version (1.1.0) was released in December 2017; earlier versions were released in June 2017 (1.0.0) and October 2012 (0.1-2), and the original version (0.1-1) was completed in February 2009.
  • Dubin JA, Müller HG, Wang JL. R (and S-Plus) function event.history, which is contained in Frank Harrell’s Hmisc package. The function is based on the method from Dubin, Müller, and Wang (2001). Completed in 2001.
  • Lee JJ, Hess KH, Dubin JA.  R (and S-Plus) function event.chart, which is contained in Frank Harrell’s Hmisc package. The function is based on the method from Lee, Hess, and Dubin (2000), and Dubin, Lee, and Hess (1997). Completed in 1997, with updates in 2000 and 2008.