Optimizing particle properties in cooling crystallization.

In this project, I explored whether a complex, time-dependent chemical process, namely unseeded batch cooling crystallization, can be optimized using model-free deep reinforcement learning. The process was formulated as a Markov decision process (Fig. 1), and a Proximal Policy Optimization (PPO) agent was trained to learn a control policy directly from simulated process data.
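To make the Markov decision process formulation concrete, here is a minimal sketch of what such a simulation environment could look like. It is not the environment used in this project: the class name, the observation layout, the method-of-moments kinetics, the solubility curve, and every numeric parameter are hypothetical placeholders chosen only for illustration. The state holds temperature, concentration, mean crystal size, and batch progress; the action is a normalized cooling rate; the reward is the mean crystal size at the end of the batch.

```python
import math
from dataclasses import dataclass


@dataclass
class CrystallizerParams:
    # All values are hypothetical placeholders, not fitted to any real system.
    t_start: float = 40.0     # initial temperature, deg C
    t_end: float = 20.0       # lowest allowed temperature, deg C
    max_cooling: float = 0.5  # largest cooling rate, K per minute
    n_steps: int = 120        # control steps per batch, one minute each
    kb: float = 1e8           # nucleation rate constant, #/(kg solvent * min)
    b: float = 3.0            # nucleation order
    kg: float = 1e-4          # growth rate constant, m/min
    g: float = 1.5            # growth order
    rho: float = 1500.0       # crystal density, kg/m^3
    kv: float = 0.5           # volume shape factor


def solubility(temp_c):
    """Hypothetical solubility curve in kg solute per kg solvent."""
    return 0.1 * math.exp(0.03 * temp_c)


class CoolingCrystallizationEnv:
    """Toy MDP: observe the process state, choose a cooling rate each minute."""

    def __init__(self, params=None):
        self.p = params or CrystallizerParams()
        self.reset()

    def reset(self):
        self.temp = self.p.t_start
        self.conc = solubility(self.p.t_start)  # saturated solution, no seeds
        self.mu = [0.0, 0.0, 0.0, 0.0]          # moments mu0..mu3 of the size distribution
        self.k = 0
        return self._observe()

    def _observe(self):
        return (self.temp, self.conc, self.mean_size(), self.k / self.p.n_steps)

    def mean_size(self):
        """Number-mean crystal size mu1/mu0 in metres (0 while no crystals exist)."""
        return self.mu[1] / self.mu[0] if self.mu[0] > 0.0 else 0.0

    def step(self, action):
        p = self.p
        a = min(max(float(action), 0.0), 1.0)  # clip action to [0, 1]
        self.temp = max(self.temp - a * p.max_cooling, p.t_end)
        c_sat = solubility(self.temp)
        s = max(self.conc / c_sat - 1.0, 0.0)  # relative supersaturation
        nucleation = p.kb * s ** p.b           # power-law nucleation rate
        growth = p.kg * s ** p.g               # power-law growth rate
        mu0, mu1, mu2, mu3 = self.mu
        # Explicit Euler update of the moment equations with dt = 1 min.
        new_mu3 = mu3 + 3.0 * growth * mu2
        self.mu = [mu0 + nucleation, mu1 + growth * mu0,
                   mu2 + 2.0 * growth * mu1, new_mu3]
        consumed = p.rho * p.kv * (new_mu3 - mu3)  # solute taken up by crystals
        # Guard: never let one Euler step push the solution below saturation.
        # When the guard is active the step is not strictly mass-conserving.
        self.conc = max(self.conc - consumed, min(self.conc, c_sat))
        self.k += 1
        done = self.k >= p.n_steps
        reward = self.mean_size() if done else 0.0  # sparse terminal reward
        return self._observe(), reward, done


def run_episode(env, policy):
    """Roll out one batch with a policy mapping observation -> action."""
    obs, done, total = env.reset(), False, 0.0
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total
```

For example, `run_episode(CoolingCrystallizationEnv(), lambda obs: 0.5)` rolls out a constant cooling rate. A PPO implementation would interact with the environment through the same `reset` and `step` calls; that wiring is not shown here.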
The objective was to deliberately shape the particle size distribution and produce larger, more stable crystals. The RL agent autonomously developed a control strategy that clearly outperformed classical benchmark profiles.
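The benchmark profiles are not specified in this summary. As an illustration only, the sketch below generates two temperature trajectories that are commonly used as fixed references in cooling crystallization: a power-law profile (linear when the exponent is 1, slow at first and faster later for larger exponents) and natural cooling toward a constant jacket temperature following Newton's law of cooling. The function names, parameters, and example values are assumptions, not the profiles used in this project.

```python
import math


def power_law_profile(t, t_total, temp_start, temp_end, exponent=1.0):
    """Temperature at time t for T(t) = T0 - (T0 - Tf) * (t / t_total) ** n.

    exponent = 1 gives linear cooling; exponent > 1 cools slowly at first.
    """
    frac = min(max(t / t_total, 0.0), 1.0)  # clamp to the batch duration
    return temp_start - (temp_start - temp_end) * frac ** exponent


def natural_cooling_profile(t, temp_start, temp_jacket, time_constant):
    """Newtonian cooling toward a constant jacket temperature."""
    return temp_jacket + (temp_start - temp_jacket) * math.exp(-t / time_constant)


# Hypothetical example: a 120-minute batch cooled from 40 to 20 deg C.
times = range(0, 121, 30)
linear = [power_law_profile(t, 120.0, 40.0, 20.0, exponent=1.0) for t in times]
cubic = [power_law_profile(t, 120.0, 40.0, 20.0, exponent=3.0) for t in times]
natural = [natural_cooling_profile(t, 40.0, 20.0, 30.0) for t in times]
```

A learned policy differs from these references in that it chooses each cooling step from the observed process state, rather than following a trajectory fixed in advance.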
This work demonstrates how modern reinforcement-learning methods can discover optimal process pathways in complex, nonlinear systems when a suitable simulation environment is available, which opens new opportunities for data-driven process development and digital chemical engineering.