Candidate: Mohamed Sadok Gastli
Title: Deep Learning Tools for Yield and Price Forecasting Using Satellite Images
Date: June 30, 2021
Time: 2:00 PM
Supervisor(s): Karray, Fakhri
The ability to forecast crop yields and prices is vital to securing global food availability and to providing farmers, retailers, and consumers with valuable information to maximize effectiveness. Conventional approaches to this problem often rely on localized methods that are expensive and limited in generalizability. To address some of these known issues and to benefit from recently developed machine learning tools, this thesis explores the use of deep learning models together with satellite images to forecast various crop yields and prices across the USA. The USA was chosen as a case study given the abundance of datasets pertaining to weather and agricultural information. Moreover, the thesis explores Transfer Learning (TL) and incremental learning applications in the field for generalizability. In addition, a web application with a user-friendly interface is designed and implemented to make the proposed models and approaches easier to use.
Multiple machine learning models, specifically those based on artificial neural networks, are deployed and tested, along with several voting regressor ensembles. The models are tested using satellite images for California and the Midwest in the USA to predict soybean yield and to forecast strawberry and raspberry yield and price. Dimensionality reduction is applied by converting those satellite images into histograms that represent the pixel value frequency count. To gauge the performance of the deployed models, several evaluation metrics are used, including Mean Absolute Error (MAE), Root Mean-Squared Error (RMSE), R-Squared Coefficient (R²), as well as Aggregated Measure (AGM) and their Average Aggregated Measure (AAGM).
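As an illustration of the histogram-based dimensionality reduction and of three of the evaluation metrics named above, the following is a minimal sketch in plain Python. It is not the thesis implementation: the function names, the equal-width binning, the bin count, and the assumption of integer pixel values between 0 and max_value are all hypothetical choices made for the example. AGM and AAGM are omitted because their definitions are not given here.

```python
from math import sqrt


def band_histogram(pixels, num_bins, max_value):
    """Reduce a 2-D grid of integer pixel values to per-bin frequency counts.

    Assumes (hypothetically) equal-width bins over the range 0..max_value.
    """
    counts = [0] * num_bins
    for row in pixels:
        for value in row:
            # Map the value to a bin; the min() guards against values
            # that exceed max_value by placing them in the last bin.
            index = min(value * num_bins // (max_value + 1), num_bins - 1)
            counts[index] += 1
    return counts


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean-Squared Error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy 2x3 single-band "image" with 8-bit pixel values (made-up data).
image = [[0, 10, 255], [128, 128, 200]]
histogram = band_histogram(image, num_bins=4, max_value=255)

# Toy targets and predictions (made-up data).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
print(histogram, mae(y_true, y_pred), rmse(y_true, y_pred), r_squared(y_true, y_pred))
```

The histogram discards the spatial arrangement of pixels and keeps only how often each intensity range occurs, so an image of any size becomes a fixed-length vector of num_bins counts.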
This work demonstrates the potential of deep learning based models in real-life applications, where they provide crucial insight for all stakeholders in the field of agriculture. The deployed multi-module models and voting regressor ensembles performed better than the single-module models. The proposed CNN-LSTM is found to outperform Convolutional Neural Network (CNN) models proposed in the literature by an average RMSE percentage improvement of 31%, while the inclusion of satellite images of surface and subsurface moisture levels enhances the prediction performance. In addition, it is observed that all deployed models consistently lose forecasting performance the further into the future they forecast, with the CNN-LSTM Ensemble outperforming each of its components as well as the LSTM in yield forecasting, while the CNN-LSTM outperforms the LSTM in price forecasting. Moreover, the proposed CNN-LSTM-SAE Ensemble outperforms the deployed CNN-LSTM, VAE, and SAE models, as well as the literature CNN model, by 70% AGM improvement for yield forecasting and 66% for price forecasting. Incremental learning is deployed with the CNN-LSTM Ensemble for yield forecasting without drastic loss in performance. Finally, based on the AGM metric, it is found that the TL CNN-LSTM outperforms the non-TL CNN-LSTM model by almost 28% AGM, with a 49% reduction in computational time.
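To make the notions of a voting regressor ensemble and a percentage improvement in an error metric concrete, here is a small sketch in plain Python. It rests on two assumptions that are not confirmed by this abstract: that the ensemble combines its component models by an unweighted average of their predictions, and that percentage improvement is computed as the relative reduction in error with respect to the baseline. The function names and the numbers are hypothetical.

```python
def voting_average(predictions_per_model):
    """Combine several models' predictions by unweighted averaging.

    predictions_per_model is a list with one list of predictions per model,
    all of the same length. Unweighted averaging is an assumption here.
    """
    return [sum(sample) / len(sample) for sample in zip(*predictions_per_model)]


def percent_improvement(baseline_error, model_error):
    """Relative error reduction versus a baseline, in percent.

    Assumed definition: (baseline - model) / baseline * 100.
    """
    return (baseline_error - model_error) / baseline_error * 100


# Made-up predictions from two hypothetical component models.
model_a = [1.0, 2.0]
model_b = [3.0, 4.0]
ensemble = voting_average([model_a, model_b])

# Made-up error values: a baseline RMSE of 2.0 reduced to 1.5.
improvement = percent_improvement(2.0, 1.5)
print(ensemble, improvement)
```

Under this assumed definition, a positive value means the model's error is lower than the baseline's, which is the sense in which the percentage improvements above are read.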
For future work, there is potential in expanding the utilized datasets and models to verify and improve the obtained results, as well as in investigating the performance on additional fresh produce and counties to better gauge and enhance the effectiveness of the models and application.