(226e) Optimal Artificial Neural Network Architecture Synthesis and Input Selection
2021 AIChE Annual Meeting
Computing and Systems Technology Division
Advances in Computational Methods and Numerical Analysis I
Tuesday, November 9, 2021 - 10:13am to 10:32am
In this study, a mixed-integer nonlinear programming (MINLP) formulation is proposed to design and train a feedforward artificial neural network (ANN) simultaneously and optimally by: i) determining the optimal number of neurons; ii) synthesizing the optimal information flow among those neurons, the inputs, and the outputs; iii) minimizing the covariance of the estimated parameters to ensure an identifiable ANN architecture without overfitting; and iv) selecting the optimal input variables from a multivariable data set.
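The abstract does not state the formulation explicitly. As a rough illustration only, a superstructure MINLP of this kind is often written with one binary variable per candidate neuron, input, or connection. In the sketch below, the weighted-sum objective, the weight \lambda, the big-M bound M, and the symbols \theta, z, and \Sigma_\theta are assumptions made for illustration and are not taken from the study.

```latex
\begin{aligned}
\min_{\theta,\, z}\quad
  & \sum_{k=1}^{N} \bigl\| y_k - \hat{y}(x_k;\theta) \bigr\|_2^2
    \;+\; \lambda\, \operatorname{tr}\bigl(\Sigma_\theta\bigr) \\
\text{s.t.}\quad
  & -M z_j \le \theta_j \le M z_j, \qquad j = 1,\dots,P, \\
  & z_j \in \{0,1\}, \qquad \theta \in \mathbb{R}^{P}.
\end{aligned}
```

Here \hat{y}(x_k;\theta) is the network prediction for sample k, \Sigma_\theta is an estimate of the parameter covariance, and z_j = 0 forces the corresponding weight \theta_j to zero, which removes that connection (or, when every weight attached to a neuron or input is zero, that neuron or input) from the network.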
Unlike other approaches, in which the hyperparameters and architecture are fixed, the hyperparameters here are treated as additional decision variables that account for the existence or non-existence of neurons, inputs, and their connections [3]. Moreover, the formulation minimizes both the training error and the parameter covariance. The resulting MINLP problems are solved using a quasi-decomposition algorithm composed of an outer integer programming problem and an inner nonlinear programming problem. Results show that the suggested approach yields an optimal ANN architecture with tighter prediction bounds and higher prediction accuracy on test data than traditional fully connected networks.
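The outer/inner split can be pictured with a small sketch. This is not the authors' quasi-decomposition algorithm: the outer integer problem is replaced by exhaustive enumeration of binary input masks, the inner nonlinear problem by plain gradient descent on a one-hidden-layer tanh network, and the covariance term by a simple per-input penalty. The function names `inner_nlp` and `outer_search`, the penalty value, and the synthetic data are all hypothetical.

```python
import itertools
import numpy as np


def inner_nlp(X, y, mask, n_hidden=3, steps=300, lr=0.05, seed=0):
    """Inner problem: fit the continuous weights for a fixed binary input mask.

    Gradient descent on the mean squared error is an assumed stand-in
    for a proper NLP solver.
    """
    rng = np.random.default_rng(seed)
    Xm = X[:, np.array(mask, dtype=bool)]      # keep only the selected inputs
    n, d = Xm.shape
    W1 = rng.normal(scale=0.5, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(scale=0.5, size=n_hidden)
    b2 = 0.0
    for _ in range(steps):
        H = np.tanh(Xm @ W1 + b1)              # hidden activations
        r = H @ w2 + b2 - y                    # residuals
        dH = np.outer(r, w2) * (1.0 - H ** 2)  # backpropagate through tanh
        g_W1 = 2.0 * Xm.T @ dH / n             # gradients of the MSE
        g_b1 = 2.0 * dH.mean(axis=0)
        g_w2 = 2.0 * H.T @ r / n
        g_b2 = 2.0 * r.mean()
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    H = np.tanh(Xm @ W1 + b1)
    return float(np.mean((H @ w2 + b2 - y) ** 2))


def outer_search(X, y, penalty=0.01):
    """Outer problem: choose the binary input mask.

    Exhaustive enumeration is an assumed stand-in for an integer
    programming solver; the per-input penalty is an assumed stand-in
    for the parameter-covariance term.
    """
    scores = {}
    for mask in itertools.product((0, 1), repeat=X.shape[1]):
        if sum(mask) == 0:
            continue                           # at least one input is required
        scores[mask] = inner_nlp(X, y, mask) + penalty * sum(mask)
    best_mask = min(scores, key=scores.get)
    return best_mask, scores


# Synthetic data: the third input does not influence the output.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(80, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

best_mask, scores = outer_search(X, y)
print("selected inputs:", best_mask, "objective:", scores[best_mask])
```

Enumeration is only workable for a handful of binary variables, since the number of candidate masks doubles with each one; that growth is the usual reason for passing the outer problem to an integer programming solver instead.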
Keywords: Artificial Neural Networks; Error Propagation; Mixed Integer Nonlinear Programming; Optimal Input Selection; Parameter Uncertainty
References:
1) Dua, V. (2010). A mixed-integer programming approach for optimal configuration of artificial neural networks. Chemical Engineering Research and Design, 88(1), 55-60.
2) Dua, V. (2006). Optimal configuration of artificial neural networks. In Computer Aided Chemical Engineering (Vol. 21, pp. 1599-1604). Elsevier.
3) Sildir, H., Aydin, E., & Kavzoglu, T. (2020). Design of feedforward neural networks in the classification of hyperspectral imagery using superstructural optimization. Remote Sensing, 12(6), 956.