Research interests

Youth and substance use

Youth is a critical period of development and transition during which risky behaviours often emerge.

One of the new risks youths may experience is substance use. This research aims to identify risk profile phenotypes in youth substance use and explore the dynamic transitions in use patterns across time.

This research employs both unsupervised machine learning (ML) methods and a hierarchical statistical modelling approach, drawing on an ongoing health survey of a sample of Canadian secondary school students and the schools they attend.

Supervisors: Dr. Zahid Butt, Dr. Helen Chen, Dr. Plinio Morita

Postdoc: Yang (Rena) Yang

Relevant Publications:

  • Yang Yang, Zahid A. Butt, Scott T. Leatherdale, Plinio P. Morita, Alexander Wong, Helen H. Chen. Phenotyping Risk Profiles of Polysubstance Use Among Canadian Youth via Fuzzy Clustering. Oral presentation given at the 7th Annual Conference on Vision and Intelligent Systems (CVIS 2021), University of Waterloo.
  • Yang Yang, Helen H. Chen, Zahid A. Butt, Scott T. Leatherdale, Plinio P. Morita, Alexander Wong. Exploring the Dynamic Transitions in Use Patterns of Youth Substance Use: A Latent Variable Modelling Approach Using the COMPASS Data. Oral presentation given at the Canadian Public Health Association (CPHA 2021).

Infodemics

“We’re not just fighting an epidemic; we’re fighting an infodemic,” ~ Tedros Adhanom Ghebreyesus, WHO Director-General.

False information about COVID-19 emerged as soon as the disease was discovered and has spread around the world via social media and the Internet. From an early conspiracy theory linking COVID-19 with 5G infrastructure to the current COVID-19 vaccine hesitancy, the infodemic has significantly compromised public health worldwide.

Our research team focuses on the critical issue of infodemics associated with infectious diseases, especially the current COVID-19 pandemic. We have already published a scoping review that explored the topics and sentiments the general public expresses on social media. We’re currently developing a conceptual framework to explore how the social media infodemic influences individual decisions and behaviours.

Our future plans include testing and revising our conceptual framework and further addressing how to manage the infodemic through infosurveillance. Another upcoming topic of interest is how the infodemic has disproportionately affected marginalized or vulnerable populations.

Supervisors: Dr. Helen Chen, Dr. Zahid Butt, Dr. Samantha Meyer

Student Member: Shu-Feng Tsao

Relevant Publications:

Infoveillance

Emerging infectious diseases (EIDs) such as SARS, Ebola virus disease and, most recently, COVID-19 are a significant threat to global health. During these epidemics, the general public makes greater use of the Internet and media, both to find information and to express thoughts and ideas.

Analyzing people's internet search behaviour regarding health-related information, together with their social media use, can guide real-time surveillance of emerging diseases and help predict epidemics. Such surveillance would serve as an early warning and help public health authorities and hospitals plan for and respond to EID threats.

The ultimate goal is to create a real-time surveillance system that would act as an early warning system to predict EIDs and inform public health authorities and hospitals.

Supervisors: Dr. Zahid Butt, Dr. Helen Chen

Postdoc: Yang (Rena) Yang

Student Member: Shu-Feng Tsao

Health synthetic data

Accessing electronic medical records remains challenging and resource-intensive, given privacy, security, and other concerns. Health synthetic data are one solution to this issue and can be used to accelerate research and technology development.

Unlike the United Kingdom and the United States, Canada still lacks synthetic health datasets that meet findable, accessible, interoperable, and reusable (FAIR) principles. Furthermore, an advanced machine learning technique—federated learning—has been used to protect data privacy since it does not require exchanging real data across nodes when training models. However, federated learning has not been optimized in health research, and has rarely been conducted with health synthetic data.
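To illustrate why federated learning does not require exchanging real data across nodes, the following is a minimal sketch of federated averaging on a toy linear model. It is not the project's implementation: the function names, the learning rate, and the three sites with their data points are all hypothetical, chosen only to show that the coordinating server receives model weights and never the underlying records.

```python
# Minimal federated averaging (FedAvg) sketch.
# All names, hyperparameters, and data here are hypothetical illustrations.
from typing import List, Tuple

Point = Tuple[float, float]    # (x, y) record held privately by one node
Weights = Tuple[float, float]  # (slope, intercept) of a linear model


def local_update(weights: Weights, data: List[Point],
                 lr: float = 0.05, epochs: int = 20) -> Weights:
    """Run gradient descent on one node's private data.

    Only the updated weights are returned; the records stay on the node.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for y ~ w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Combine node weights, weighting each node by its number of records."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return w, b


def train(nodes: List[List[Point]], rounds: int = 50) -> Weights:
    """Server loop: broadcast weights, collect local updates, average them."""
    global_weights: Weights = (0.0, 0.0)
    sizes = [len(d) for d in nodes]
    for _ in range(rounds):
        updates = [local_update(global_weights, d) for d in nodes]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Three hypothetical sites, each holding points drawn from y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0)]
site_b = [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
site_c = [(-1.0, -1.0), (0.5, 2.0)]
slope, intercept = train([site_a, site_b, site_c])
```

In this toy setting the averaged model should approach a slope of 2 and an intercept of 1, even though no site ever shares its points. Real health applications would involve far more complex models and additional privacy safeguards that this sketch does not cover.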

Therefore, this project aims to create FAIR research data management (RDM) for health synthetic data along with federated learning. The project will use chimeric antigen receptor T-cell (CAR-T) therapy data as a case study to address the following three research questions:

  • How can useful FAIR synthetic CAR-T data be generated and curated at the cohort level to represent complex patient populations, while preserving privacy in the synthetic data and when training federated learning models?
  • How can synthetic data and metadata needs, and issues related to data ingestion, transformation, and preservation, be addressed and managed in the RDM workflow and process?
  • What policy changes are needed to govern the sharing of FAIR health synthetic data across pan-Canadian networks?

Supervisors: Dr. Helen Chen, Dr. Zahid Butt, Dr. Plinio Morita, Dr. Catherine Burns, Dr. William Wong

Student Members: Shu-Feng Tsao, Kam Sharm, Hateem Noor

*This project has been awarded a Compute Ontario grant.