(344f) A Deep Reinforcement Learning Approach for Production Scheduling
2019 AIChE Annual Meeting
Computing and Systems Technology Division
Machine Learning Applications and Intelligent Systems
Tuesday, November 12, 2019 - 2:05pm to 2:24pm
Production scheduling problems face many sources of uncertainty, from demand uncertainty to pricing changes, among others. For many supply chain scheduling problems, decisions need to be made in real time as the situation in the plant changes, leading industrial operations to employ human schedulers in the decision-making process who react to pressures in the organization, often rescheduling and creating suboptimal schedules. Shobrys and White [2002] estimate that "good" decisions for these scheduling problems can increase the profit margin by at least $10/ton of product. Given the thousands of tons produced each day by modern industrial chemical operations, there is a large financial incentive to improve scheduling processes and decision making under uncertainty.
Although there is a long history of optimization under uncertainty, many techniques are difficult to implement due to high computational costs and the difficulty of characterizing sources of uncertainty (endogenous vs. exogenous) and measuring them [Grossmann et al., 2016]. The stochastic optimization approach deals with uncertainty in stages: a decision is made, then the uncertainty is revealed, which enables a recourse decision to be made given the new information. For scheduling applications, Jung et al. [2004] develop a multi-stage stochastic optimization model to determine the safety stock levels needed to maintain a given customer satisfaction level under stochastic demand. Sand and Engell [2004] develop a two-stage stochastic mixed-integer linear program to schedule a chemical batch process on a rolling horizon while accounting for the risk associated with their decisions. Englberger et al. [2016] develop and implement a two-stage stochastic optimization model for integrated master production scheduling on a rolling horizon that reduces delays at the expense of higher safety stock.
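For context, the staged structure underlying these approaches is captured by the classical two-stage stochastic program with recourse, written here in its standard textbook form (the cited papers use problem-specific variants):

```latex
\begin{align}
  \min_{x} \quad & c^{\top} x + \mathbb{E}_{\xi}\!\left[ Q(x, \xi) \right]
    && \text{first-stage (here-and-now) decision } x \\
  \text{s.t.} \quad & A x = b, \quad x \ge 0, \nonumber \\[4pt]
  Q(x, \xi) = \min_{y} \quad & q(\xi)^{\top} y
    && \text{second-stage recourse decision } y \\
  \text{s.t.} \quad & W y = h(\xi) - T(\xi)\, x, \quad y \ge 0. \nonumber
\end{align}
```

The first-stage decision x is fixed before the uncertain data ξ (e.g., demand) is revealed; the recourse decision y then reacts to the realized scenario, and the objective trades off first-stage cost against expected recourse cost.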
In this paper we explore a new approach to scheduling that uses deep reinforcement learning (DRL) to train an agent to schedule a multi-product reactor under uncertainty and meet service-level and profitability targets for the supply chain organization, motivated by a real-world example. The DRL model recasts the scheduling problem as a Markov decision process (MDP), in which the state is defined by the demand, forecast, inventory levels, current production, schedule, and time, and the actions select the products to be scheduled at given time intervals. This reformulation as an MDP gives the model a natural representation of uncertainty. While training must take place off-line using a simulation, once trained, the model can be deployed on-line in a production system to yield real-time schedules as new orders are entered into the enterprise resource planning (ERP) system and as the situation in the plant changes due to delays and unplanned events.
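To make the MDP concrete, the sketch below shows one possible encoding of the state and action described above. All field names, dimensions, and the simplified transition are illustrative assumptions; the abstract does not specify the exact representation.

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical state for the scheduling MDP described above. Field names and
# dimensions are illustrative assumptions, not the paper's exact design.
@dataclass
class SchedulingState:
    demand: np.ndarray        # realized demand per product
    forecast: np.ndarray      # demand forecast per product over the horizon
    inventory: np.ndarray     # current inventory level per product
    current_production: int   # index of the product currently on the reactor
    schedule: np.ndarray      # planned product index per future time slot
    time: int                 # current time step

    def to_vector(self) -> np.ndarray:
        """Flatten the state into a single observation vector for the agent."""
        return np.concatenate([
            self.demand,
            self.forecast.ravel(),
            self.inventory,
            [self.current_production, self.time],
            self.schedule,
        ]).astype(np.float32)

# The action space is discrete: choose which product to run in the next slot.
def apply_action(state: SchedulingState, product: int) -> SchedulingState:
    """Append the chosen product to the schedule (simplified transition)."""
    new_schedule = np.roll(state.schedule, -1)
    new_schedule[-1] = product
    return SchedulingState(state.demand, state.forecast, state.inventory,
                           state.current_production, new_schedule,
                           state.time + 1)
```

In deployment, the demand, forecast, and inventory fields would be refreshed from the ERP system at each decision point, so the agent's observation always reflects the current plant situation.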
A drawback of DRL is its lack of theoretical performance guarantees. To address this, we benchmark the trained DRL agent against stochastic mixed-integer linear programming (MILP) and deterministic models to understand how well the DRL solutions measure up. All models are validated using Monte Carlo simulations, which show that DRL performs well in comparison to the MILP approaches. In addition, we explore integrating MILPs with DRL, combining the optimality guarantees of the former with the efficient time-to-solution of the latter, by training the agent to imitate optimal decisions in a supervised setting before allowing it to train in reinforcement mode.
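The sketch below illustrates such a two-phase scheme: a behavior-cloning phase that fits the policy to MILP decisions, followed by a REINFORCE-style policy-gradient phase in the simulator. The network architecture, the Gym-style environment interface, and the source of the MILP labels are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

STATE_DIM, N_PRODUCTS = 32, 5  # illustrative dimensions
policy = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, N_PRODUCTS),  # logits over products to schedule next
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def pretrain_on_milp(states, milp_actions, epochs=10):
    """Phase 1: supervised imitation of MILP schedules.
    states: float tensor (N, STATE_DIM); milp_actions: long tensor (N,)."""
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(policy(states), milp_actions)
        loss.backward()
        opt.step()

def reinforce_episode(env, gamma=0.99):
    """Phase 2: on-policy REINFORCE update in the scheduling simulator.
    `env` is assumed to expose reset()/step() like a Gym environment and to
    return state observations as float tensors."""
    state, done = env.reset(), False
    log_probs, rewards = [], []
    while not done:
        dist = Categorical(logits=policy(state))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state, reward, done, _ = env.step(action.item())
        rewards.append(reward)
    # Discounted returns, normalized, then one policy-gradient step.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    loss = -(torch.stack(log_probs) * returns).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The supervised phase gives the agent a strong starting policy drawn from provably good MILP decisions; the reinforcement phase then lets it adapt to the stochastic simulator, where the MILP labels are no longer optimal under every realization of the uncertainty.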
References
J. Englberger, F. Herrmann, and M. Manitz. Two-stage stochastic master production scheduling under demand uncertainty in a rolling planning environment. International Journal of Production Research, 54(20):6192–6215, 2016. ISSN 1366-588X. doi: 10.1080/00207543.2016.1162917.
I. E. Grossmann, R. M. Apap, B. A. Calfa, P. García-Herreros, and Q. Zhang. Recent advances in mathematical programming techniques for the optimization of process systems under uncertainty. Computers and Chemical Engineering, 91:3–14, 2016. ISSN 0098-1354. doi: 10.1016/j.compchemeng.2016.03.002.
J. Y. Jung, G. Blau, J. F. Pekny, G. V. Reklaitis, and D. Eversdyk. A simulation based optimization approach to supply chain management under demand uncertainty. Computers and Chemical Engineering, 28(10):2087–2106, 2004. ISSN 0098-1354. doi: 10.1016/j.compchemeng.2004.06.006.
G. Sand and S. Engell. Modeling and solving real-time scheduling problems by stochastic integer programming. Computers and Chemical Engineering, 28(6-7):1087–1103, 2004. ISSN 0098-1354. doi: 10.1016/j.compchemeng.2003.09.009.
D. E. Shobrys and D. C. White. Planning, scheduling and control systems: Why cannot they work together. Computers and Chemical Engineering, 26(2):149–160, 2002. ISSN 0098-1354. doi: 10.1016/S0098-1354(01)00737-2.