(599f) Parallel Cyclic Reduction Decomposition for Dynamic Optimization Problems | AIChE

(599f) Parallel Cyclic Reduction Decomposition for Dynamic Optimization Problems

Authors 

Eason, J. P. - Presenter, Carnegie Mellon University
Wan, W., Carnegie Mellon Univeristy
Biegler, L., Carnegie Mellon University
Nicholson, B., Sandia National Laboratories
Direct transcription of dynamic optimization problems, with differential-algebraic equations discretized and written as algebraic constraints, can create very large nonlinear optimization problems. When this discretized optimization problem is solved with an NLP solver, such as IPOPT, the dominant computational cost often lies in solving the linear system that generates Newton steps for the KKT system. Computational cost and memory constraints for this linear system solution raise many challenges as the system size increases. On the other hand, the linear KKT system for our dynamic optimization problem is sparse and structured, and can be permuted to form a block tridiagonal matrix. This study explores a parallel decomposition strategy for block tridiagonal systems that is based on cyclic reduction (CR) factorization of the KKT matrix. The classical CR method has good observed performance, but its numerical stability properties are unproved for our KKT system. Alternately, Yalamov and Pavlov proposed a modified, full-space CR algorithm with proven stability bounds. Here we compare the performance of both methods on random test matrices. Stable behavior is observed in both cases, and the classical CR method has lower computational cost. This result is explained by analyzing the complexity of both methods. Finally, we discuss modifications to the CR decomposition that improve performance, and we apply the approach to four industrially relevant case studies. These include a moving horizon estimation of a CSTR, nonlinear model predictive control of a CSTR, grade transition for a polymerization process, and nonlinear model predictive control of a bubbling fluidized bed reactor. The latter problem is discretized in both space and time, resulting in up to 250000 variables and constraints. On this problem, a parallel speedup of a factor of four is observed when using eight processors.