MASc Oral Exam | A Reinforcement Learning Framework for Simultaneous Chemical Process Flowsheet Generation, Design and Control

Tuesday, August 27, 2024 10:00 am - 11:00 am EDT (GMT -04:00)

You are welcome to attend Simone Reynoso Donzelli's MASc oral exam, where they will discuss their research on A Reinforcement Learning Framework for Simultaneous Chemical Process Flowsheet Generation, Design and Control.

The exam will be virtual, and can be found here: https://teams.microsoft.com/l/meetup-join/19%3ameeting_MWRhMjMyZjQtYTkwNS00ODNjLTgwMDUtNzlhOTFiYjQyYjJm%40thread.v2/0?context=%7b%22Tid%22%3a%22723a5a87-f39a-4a22-9247-3fc240c01396%22%2c%22Oid%22%3a%22747fc62d-9571-4687-8879-375e1557eeea%22%7d

Abstract:

Integration of process design and control of chemical process flowsheets
(CPFs) is a key focus in chemical engineering, receiving extensive research
attention. The main objective in this area is to identify optimal process
design and control variables for a CPF, ensuring both economic viability and
dynamic feasibility of plant operations. This integration presents a complex
optimization problem, which is challenging to solve using traditional
optimization methods. The problem becomes even more intricate when discrete
decisions or logical constraints, which give rise to Boolean variables, are
considered, as is common in integrated design and control problems for CPFs.
Therefore, the development of new methodologies is needed to
effectively address these challenges. The emerging trend in Machine Learning
(ML), particularly in Reinforcement Learning (RL), for solving such problems
serves as the foundation for this thesis. The limited number of studies that
solve the integrated problem using RL techniques motivates the
exploration and development of novel methodologies. Before addressing the
integrated problem, it is important to understand the potential of RL as a
tool for solving design and optimization problems of CPFs under steady-state
conditions. To that end, an RL methodology is proposed that introduces two
novel RL agents: a discrete masked Proximal Policy Optimization (mPPO) agent
and a hybrid masked Proximal Policy Optimization (mHPPO) agent. In this
framework, the agents autonomously generate, design and optimize CPFs using
an inlet flowrate and a set of unit operations (UOs) as initial information. A
key feature of this approach is the use of masking, an underexplored yet
promising technique for the present problem, which incorporates expert
knowledge or design rules to exclude certain actions from the agent's
decision-making process and thereby enhances the agent's performance. In
addition, this method stands out by seamlessly integrating
masked agents with rigorous models of UOs, including advanced thermodynamic
and conservation equations, within its simulation environment. The
effectiveness of these agents was evaluated through several case studies,
including two that utilized commercial simulation suites as part of the RL
environment. The resulting CPFs generated by the RL agents present viable
flowsheet designs that meet the pre-specified design requirements.
Recognizing the potential of RL for designing CPFs, this thesis also
introduces a novel RL approach for generating, designing, and controlling
CPFs. Similar to the previous methodology, the proposed framework generates
CPFs directly from an inlet stream, eliminating the need for predefined
arrangements of UOs. Furthermore, the framework leverages surrogate models,
specifically Neural Networks (NNs), to accelerate the learning process of the
RL agent and avoid dependence on mechanistic dynamic models. These surrogate
models approximate key process variables and closed-loop performance metrics
for complex dynamic UO models. The results obtained using this methodology
were compared with model-based optimization results to assess the accuracy
and validity of the proposed approach, and its consistency with this
well-established methodology.
Additional case studies involved formulations with multiple UOs to further
demonstrate the approach's flexibility in handling various scenarios.
Results from those case studies demonstrate that the RL agent can effectively
learn to maintain the dynamic operability of the UOs under disturbances,
adhere to equipment design and operational constraints, and generate viable
and economically attractive CPFs. The high adaptability offered by the
surrogate models enables this methodology to approximate the dynamic behavior
of the most common UOs. As a result, the proposed framework is sufficiently
explicit and flexible to be used in more intricate design and control
problems involving multiple UOs.
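
The masking idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's mPPO or mHPPO implementation: the unit operation names, the design rule and the function names below are hypothetical, and the policy logits are made-up numbers. The sketch only shows the general technique of removing rule-violating actions from a discrete policy before sampling.

```python
import numpy as np

# Hypothetical discrete action set: which unit operation to place next.
UNIT_OPERATIONS = ["reactor", "flash", "distillation", "terminate"]


def design_rule_mask(n_units_placed, max_units=3):
    """Return a boolean mask of allowed actions (hypothetical design rule).

    - An empty flowsheet cannot be terminated.
    - Once max_units have been placed, only "terminate" is allowed.
    """
    mask = np.ones(len(UNIT_OPERATIONS), dtype=bool)
    terminate = UNIT_OPERATIONS.index("terminate")
    if n_units_placed == 0:
        mask[terminate] = False
    elif n_units_placed >= max_units:
        mask[:] = False
        mask[terminate] = True
    return mask


def masked_action_probabilities(logits, mask):
    """Softmax over the policy logits with masked-out actions given zero probability."""
    logits = np.asarray(logits, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("at least one action must remain valid")
    # Invalid actions get a logit of -inf, so exp() maps them to exactly 0.
    masked = np.where(mask, logits, -np.inf)
    # Subtract the largest valid logit for numerical stability.
    weights = np.exp(masked - masked[mask].max())
    return weights / weights.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = [1.0, 2.0, 3.0, 4.0]  # made-up policy outputs
    probs = masked_action_probabilities(logits, design_rule_mask(n_units_placed=0))
    action = rng.choice(len(UNIT_OPERATIONS), p=probs)
    print(UNIT_OPERATIONS[action], probs)
```

Because the masked actions receive zero probability, the agent never samples them, so it does not spend training episodes exploring choices that the design rules already exclude.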

Supervisor: Professor Sandoval