MASc Oral Exam | Machine Learning-Based Time Series Modelling with Applications for Forecasting Regional Wind Power and Air Quality Index, by Hanin Alkabbani

Hanin Alkabbani, supervised by Professor Elkamel, will complete his MASc oral exam on August 17th.

Abstract:

Time series forecasting has recently attracted considerable academic and industrial interest across a range of applications. Machine learning (ML) algorithms are known for their ability to capture the chaotic, non-linear temporal relations in time series data. This research applies various ML concepts and algorithms to two time series forecasting case studies: (1) regional wind power forecasting and (2) air quality index (AQI) forecasting.

The first case study examines regional wind power forecasting from several perspectives. First, meteorological and spatial parameters, together with seasonal and temporal features, were filtered and selected by a proposed multi-step deep feature selection approach. Next, multiple ML algorithms, including artificial neural network (ANN), deep neural network (DNN), long short-term memory (LSTM), bagging tree (BT), and support vector machine/regression (SVM/SVR), were used to train one-step-ahead forecasting models. Finally, the constructed models were assessed using several error metrics. The comparison concluded that the SVR-based model generalized accurately when tested on unseen data and surpassed the other models, including LSTM. However, for the multi-step-ahead forecasting models, the predictions obtained from the multi-input multi-output (MIMO) LSTM approach were reliable and more accurate. Overall, for multi-step forecasting, the MIMO strategy outperformed the direct multi-step forecasting method, especially when employing algorithms with recursive properties.
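The difference between the direct and MIMO multi-step strategies comes down to how the training targets are arranged. The following is a minimal sketch of that arrangement only, not the thesis's implementation: the function names, the window length, the horizon, and the sample values are all illustrative assumptions, and no model is fitted.

```python
def make_windows(series, n_lags, horizon):
    """Slice a series into (lag window, next `horizon` values) pairs.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    inputs, targets = [], []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags:start + n_lags + horizon])
    return inputs, targets


def direct_targets(targets, step):
    """Direct strategy: each single-output model sees one horizon column."""
    return [row[step] for row in targets]


# Made-up values standing in for a wind power series.
series = [10, 12, 15, 14, 13, 16, 18, 17]
X, Y = make_windows(series, n_lags=3, horizon=2)

# MIMO: a single multi-output model would be fit on (X, Y) and would
# predict the whole horizon in one pass.
# Direct: one model per step h would be fit on (X, direct_targets(Y, h)).
y_step1 = direct_targets(Y, 0)
y_step2 = direct_targets(Y, 1)
```

Under the MIMO arrangement the model sees all horizon steps jointly, so it can learn dependencies between them. The direct arrangement trains each step in isolation.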

It is also worth noting that Chapter 2 of this thesis is a comprehensive literature review of machine learning and metaheuristic methodologies for renewable power forecasting. This review can guide scientists and engineers in analyzing and selecting appropriate prediction approaches for different circumstances and applications.

The second case study focuses on missing time series data, a major problem that commonly appears in environmental data. This case study tackled the issue by imputing missing entries using a random forest (RF)-based imputation technique known as miss-forest imputation. The effectiveness of this imputation method was examined by building the AQI forecasting models twice: once using the miss-forest imputed data and once using linearly imputed data. The results showed that models trained on miss-forest imputed data could generalize AQI forecasting and predict the air index to categorize ambient air quality with an average accuracy of 92.41%.
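For readers unfamiliar with the baseline, the sketch below shows one plausible reading of linear imputation: gaps are filled by linear interpolation between the nearest observed neighbours. This is an assumption, since the abstract does not define the method. The function name, the edge handling (copying the nearest observed value), and the sample readings are hypothetical. The miss-forest method itself is not shown here.

```python
def linear_impute(values):
    """Fill None gaps by linear interpolation between observed neighbours.

    Leading and trailing gaps copy the nearest observed value
    (an assumption made for this sketch).
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")

    filled = list(values)
    # Gaps before the first or after the last observation.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]

    # Interior gaps: straight line between the two bracketing observations.
    for left, right in zip(known, known[1:]):
        delta = values[right] - values[left]
        span = right - left
        for i in range(left + 1, right):
            filled[i] = values[left] + delta * (i - left) / span
    return filled


# Made-up AQI readings with missing entries marked as None.
aqi = [40.0, None, None, 70.0, None, 50.0, None]
filled = linear_impute(aqi)
```

A baseline like this uses only the neighbouring values of the same series, whereas an RF-based imputer can also draw on other variables recorded at the same time.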
