Master's Research Papers

2024

Integrating uniCATE and CATE: A Two-Step Approach for Predictive Biomarker Discovery

Author: Rangipour, Z. 
Supervisor: Dubin, J.

Through a comprehensive simulation study, this paper screens biomarker data of varying dimensions and subsequently evaluates the association of the screened biomarkers with a health response of interest. The study extends the work of Boileau et al. ("A flexible approach for predictive biomarker discovery") by integrating their proposed two-step method, built around their novel biomarker discovery procedure uniCATE, into our analysis framework: predictive biomarkers are first filtered using a threshold criterion, and a Conditional Average Treatment Effect (CATE) model is then applied to the remaining features. These extensions were implemented with the primary aim of advancing biomarker discovery research within biostatistics. Across our study, we observed that the biomarker identification methods performed more effectively in scenarios with less complex data-generating processes (DGPs), such as the Opposite Symmetric Linear DGP: the simpler relationships between variables in these DGPs produced more consistent and reliable identification of predictive biomarkers. In scenarios with p = 500 biomarkers and a sample size of n = 100 or n = 500, the study identified the most predictive biomarkers more accurately; in particular, when n was 500, the models selected the correct biomarkers more frequently than when n was 100. The study highlights the significance of methodological choices in biomarker identification and demonstrates the effectiveness of incorporating the uniCATE and CATE methods across different simulated scenarios.
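
To make the pipeline concrete, here is a minimal Python sketch of the screen-then-model idea under stylized assumptions: the simple covariance-based univariate score and the T-learner below stand in for the paper's actual uniCATE and CATE estimators, and all names and thresholds are illustrative.

    # Step 1: score each biomarker univariately and keep those above a
    # threshold; step 2: fit a CATE model on the survivors. A toy stand-in
    # for the uniCATE + CATE pipeline, not the actual implementation.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    n, p = 500, 500                                  # one simulated scenario: n = 500, p = 500
    X = rng.normal(size=(n, p))                      # candidate biomarkers
    A = rng.integers(0, 2, size=n)                   # randomized treatment assignment
    y = X[:, 0] * (2 * A - 1) + rng.normal(size=n)   # biomarker 0 is truly predictive

    # Step 1: univariate screening. Score each centered biomarker by its
    # covariance with an inverse-probability-weighted treatment contrast.
    contrast = y * (A / A.mean() - (1 - A) / (1 - A.mean()))
    scores = np.abs((X - X.mean(axis=0)).T @ contrast) / n
    keep = np.argsort(scores)[-20:]                  # threshold: retain the top 20 biomarkers

    # Step 2: estimate the CATE on the screened biomarkers with a T-learner.
    m1 = RandomForestRegressor(random_state=0).fit(X[A == 1][:, keep], y[A == 1])
    m0 = RandomForestRegressor(random_state=0).fit(X[A == 0][:, keep], y[A == 0])
    cate_hat = m1.predict(X[:, keep]) - m0.predict(X[:, keep])
    print("truly predictive biomarker screened in:", 0 in keep)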


A spatio-temporal analysis of avian influenza H5N1 outbreaks, focusing on the impact of season and climate

Author: Bandara, P.
Supervisor: Dean, C.

The spread of the H5N1 avian influenza virus in Canada poses significant challenges for public health and ecological stability. This study assesses the spatial and temporal dynamics of H5N1 outbreaks among wild birds and mammals in Canada between 2021 and 2023, with a focus on statistical modelling of the impact of climate and season on outbreaks. Employing Poisson, negative binomial, logistic, zero-truncated, zero-inflated Poisson, and zero-inflated negative binomial models, we identify the count models that best fit the data. The model selection process was guided by statistical criteria such as the Akaike Information Criterion (AIC), likelihood ratios, and assessments of overdispersion. An application of the space–time permutation scan statistic, which relies solely on case data without requiring population-at-risk figures, facilitated the identification of high-risk areas; these areas were mapped using ArcGIS for enhanced geographical visualization. The analysis concluded that the zero-inflated negative binomial model provided a fair fit to the H5N1 case data, reflecting significant overdispersion and a higher prevalence of zero counts than a Poisson distribution would predict. Seasonality was identified as a key influence, with incidence rates varying across seasons. Correlations were observed between H5N1 case counts and human population density, as well as environmental variables such as temperature and precipitation. The study also pinpointed specific geographical and temporal clusters where the risk of H5N1 outbreaks was statistically higher. This study offers valuable statistical insights into the dynamics of H5N1 spread in Canada; the findings highlight relevant disease patterns, aiding the formulation of targeted and effective disease control strategies to mitigate the impact on both human health and wildlife.
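
As an illustration of the model-comparison step, the following Python sketch (using statsmodels) fits Poisson, negative binomial, and zero-inflated alternatives to simulated counts and ranks them by AIC; the covariates and data are invented placeholders, not the study's H5N1 records.

    # Fit four count models to toy weekly case counts and compare by AIC.
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.discrete.count_model import (
        ZeroInflatedPoisson, ZeroInflatedNegativeBinomialP)

    rng = np.random.default_rng(1)
    n = 300
    temp, precip = rng.normal(size=n), rng.normal(size=n)    # placeholder covariates
    X = sm.add_constant(np.column_stack([temp, precip]))
    # Zero-inflated, overdispersed toy counts (extra zeros on top of NB noise).
    cases = rng.negative_binomial(1, 0.5, size=n) * rng.binomial(1, 0.6, size=n)

    fits = {
        "Poisson": sm.Poisson(cases, X).fit(disp=0),
        "NegBin": sm.NegativeBinomial(cases, X).fit(disp=0),
        "ZIP": ZeroInflatedPoisson(cases, X, exog_infl=X).fit(disp=0),
        "ZINB": ZeroInflatedNegativeBinomialP(cases, X, exog_infl=X, p=2).fit(disp=0),
    }
    for name, res in sorted(fits.items(), key=lambda kv: kv[1].aic):
        print(f"{name}: AIC = {res.aic:.1f}")                # lowest AIC first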


Computational Tools for the Simulation and Analysis of Spike Trains

Author: Afable, JV.
Supervisor: Marriott, P.

This paper presents a set of tools and a workflow for replicating and modifying a spiking neural network simulation of the olfactory bulb using NEURON. Key concepts in computational neuroscience are first reviewed, including spike trains, neuron models, network architectures, and the biological circuitry of the olfactory bulb. The process of replicating an existing olfactory bulb simulation study is then described in detail. Modifications to the model are explored, investigating the effects of changing the random seed and adjusting mitral-granule cell network connectivity. Results demonstrate consistent network behavior across seeds, but a strong dependence of mitral and granule cell spiking activity on connectivity between these populations. The computational workflow establishes a framework for replicating and extending published neural simulations.
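
As a small, generic illustration of one reviewed concept (not code from the paper), the Python sketch below generates a homogeneous Poisson spike train and shows how fixing the random seed, as in the paper's seed experiments, makes such a simulation reproducible; the rate and duration are arbitrary.

    import numpy as np

    def poisson_spike_train(rate_hz, duration_s, rng):
        """Spike times from a homogeneous Poisson process via exponential ISIs."""
        isis = rng.exponential(1.0 / rate_hz, size=int(2 * rate_hz * duration_s))
        times = np.cumsum(isis)                      # inter-spike intervals -> spike times
        return times[times < duration_s]

    rng = np.random.default_rng(42)                  # fixed seed: rerunning reproduces the train
    spikes = poisson_spike_train(rate_hz=20.0, duration_s=10.0, rng=rng)
    print(f"{spikes.size} spikes, empirical rate {spikes.size / 10.0:.1f} Hz")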


Developments in Neural Simulators in Computational Neuroscience

Author: Ladtchenko, V.
Supervisor: Marriott, P.

In this paper we examine how in silico studies have allowed scientists to minimize invasive procedures such as those required in neuroscience research. We discuss simulation as an alternative and look into the inner workings of a simplified version of a modern simulator. We then discuss the mathematical modelling commonly used in simulators, covering deterministic and stochastic models, the two main ways of modelling brain neural networks. Next, we examine simulators in detail and discuss their advantages and disadvantages, focusing in particular on NEURON, the most popular simulator, and then on BrainPy, a recent development. Finally, we perform experiments on the NEURON and BrainPy simulators and find that (1) a model that takes 1 minute for 1,000 neurons in NEURON runs for 10,000 neurons in under 1 second in BrainPy, a per-neuron speed-up of roughly 600 times, and (2) BrainPy can run a simulation with up to 50,000,000 neurons, which NEURON cannot.
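
The generic Python sketch below (not BrainPy's or NEURON's actual code) hints at why array-based simulators scale: an entire leaky integrate-and-fire population is advanced with one vectorized Euler step per time step, and the noisy drive is a toy example of the stochastic modelling discussed above.

    import numpy as np

    n, dt, steps = 10_000, 0.1, 1_000                # 10,000 neurons, 0.1 ms step, 100 ms total
    tau, v_th, v_reset = 10.0, 1.0, 0.0              # time constant (ms), threshold, reset
    v = np.zeros(n)                                  # membrane potentials for the whole population
    rng = np.random.default_rng(0)
    spike_count = 0

    for _ in range(steps):
        I = rng.normal(0.15, 0.5, size=n)            # stochastic input current
        v += dt * (-v / tau + I)                     # one Euler step for all neurons at once
        fired = v >= v_th
        spike_count += int(fired.sum())
        v[fired] = v_reset                           # reset the neurons that spiked

    print(f"total spikes across the population: {spike_count}")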


Unveiling pitfalls and exploring alternatives in the use of pilot studies for sample size estimation

Author: Ji, C.
Supervisor: Zhu, Y.

Pilot studies are used to estimate effect sizes, which in turn feed power calculations that determine the sample size the main study needs to achieve a prespecified power and significance level. In this paper we explore the pitfalls of using small pilot studies to produce these estimates. We then examine three alternatives for determining a sufficient main-study sample size: the corridor of stability, which uses bootstrapping to find the sample size at which the effect size estimate becomes stable, and two Bayesian metrics, the average coverage criterion and the average length criterion, which control statistics based on the posterior distribution of the effect size. All three metrics are more robust than current methods for determining sample sizes and effect sizes from small pilot studies. Both Bayesian metrics are unaffected by sample size, and hence may be able to bypass the need for pilot studies altogether.
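
As a rough Python sketch of the corridor-of-stability idea, under assumptions not taken from the paper (a correlation as the effect size, a corridor half-width of 0.1, and an 80% bootstrap interval as the stability check):

    import numpy as np

    rng = np.random.default_rng(7)
    N, w, B = 1_000, 0.1, 500                        # max n, corridor half-width, bootstrap draws
    x = rng.normal(size=N)
    y = 0.4 * x + rng.normal(size=N)                 # true correlation is roughly 0.37
    r_ref = np.corrcoef(x, y)[0, 1]                  # reference value from the full sample

    def boot_corr_quantiles(x, y, B, rng):
        """10th and 90th percentiles of bootstrapped correlation estimates."""
        n = x.size
        idx = rng.integers(0, n, size=(B, n))        # B bootstrap resamples of indices
        rs = [np.corrcoef(x[i], y[i])[0, 1] for i in idx]
        return np.quantile(rs, [0.1, 0.9])

    for n in range(20, N + 1, 20):                   # grow the sample and check stability
        lo, hi = boot_corr_quantiles(x[:n], y[:n], B, rng)
        if r_ref - w <= lo and hi <= r_ref + w:      # bootstrap mass stays inside the corridor
            print(f"estimate stable from about n = {n}")
            break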