(226e) Optimal Artificial Neural Network Architecture Synthesis and Input Selection | AIChE


Authors 

Sildir, H. - Presenter, Gebze Institute of Technology
Aydin, E., Koç University
Artificial neural networks (ANNs) have attracted increasing attention over the past decades, owing both to enhanced computing power and to greater data availability. Classical feedforward ANNs propagate input information to the succeeding layers through linear and nonlinear operations, combining the elements of the input vector with different weights. Traditionally, all elements of an ANN are connected in a fully connected fashion. However, the performance of ANNs is highly influenced by the selection of the architecture and the input variables, which are defined by hyperparameters such as the number of neurons in the hidden layer and the connections between the network variables and inputs [1,2].
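The idea of treating individual neurons and connections as optional can be sketched with a one-hidden-layer forward pass in which binary variables gate each neuron and each input-to-hidden connection. This is an illustrative sketch only; the variable names (`neuron_on`, `conn_on`) are hypothetical and not taken from the paper:

```python
import math

def masked_forward(x, W1, b1, W2, b2, neuron_on, conn_on):
    """Forward pass of a one-hidden-layer ANN where binary variables
    gate neurons and input-to-hidden connections (illustrative sketch).

    neuron_on[j]   -- 1 if hidden neuron j exists, else 0
    conn_on[j][i]  -- 1 if the connection from input i to neuron j exists
    """
    hidden = []
    for j in range(len(W1)):
        # Each connection contributes only if its binary variable is 1.
        z = b1[j] + sum(conn_on[j][i] * W1[j][i] * x[i] for i in range(len(x)))
        # A neuron contributes only if its existence variable is 1.
        hidden.append(neuron_on[j] * math.tanh(z))
    return b2 + sum(W2[j] * hidden[j] for j in range(len(hidden)))
```

Setting all gates to 1 recovers the traditional fully connected network; the optimization described below searches over these binary variables instead of fixing them.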

In this study, a mixed-integer nonlinear programming (MINLP) formulation is proposed to simultaneously design and train a feedforward ANN in an optimal way by: i) detecting the ideal number of neurons; ii) synthesizing the optimal information flow between those neurons, the inputs, and the outputs; iii) minimizing the covariance of the estimated parameters to ensure an identifiable ANN architecture with reduced overfitting; and iv) selecting the optimum input variables in a multivariable data set.
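A schematic form of such a formulation (illustrative only; the abstract does not give the exact objective or symbols, so the weighting factor λ, the big-M constant, and the covariance estimate Σ are assumptions) is:

```latex
\min_{\theta,\; z \in \{0,1\}^{n}} \;
  \sum_{k} \bigl\lVert \hat{y}_{k}(\theta, z) - y_{k} \bigr\rVert^{2}
  \;+\; \lambda \, \operatorname{tr}\!\bigl(\Sigma(\theta)\bigr)
\qquad \text{s.t.} \quad
  -M z_{ij} \;\le\; \theta_{ij} \;\le\; M z_{ij},
```

where θ collects the continuous weights, z the binary existence variables for neurons, inputs, and connections, Σ(θ) an estimate of the parameter covariance, and the big-M constraints force a weight to zero whenever its connection is switched off.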

Unlike other approaches in which the hyperparameters and architecture are fixed, the hyperparameters are assigned as additional decision variables to account for the existence or non-existence of neurons, inputs, and their connections [3]. Moreover, the formulation minimizes both the training error and the parameter covariance. The resulting MINLP problems are solved using a quasi-decomposition algorithm composed of an outer integer programming problem and an inner nonlinear programming problem. Results show that the suggested approach yields an optimal ANN architecture with tighter prediction bounds and higher prediction accuracy on test data compared to traditional fully connected networks.
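The outer/inner split can be sketched on a toy problem: the outer integer problem enumerates candidate binary connection patterns, and for each fixed pattern an inner continuous problem trains the remaining weights. This is a minimal sketch, not the authors' algorithm; the linear model, gradient-descent inner solve, exhaustive outer enumeration, and the complexity penalty `alpha` (a stand-in for the covariance term) are all assumptions for illustration:

```python
import itertools

def inner_nlp(mask, xs, ys, steps=200, lr=0.05):
    """Inner continuous problem: with the integer connection pattern
    (mask) fixed, fit the active weights of a toy linear model
    y = sum_j mask[j] * w[j] * x by gradient descent."""
    w = [0.0] * len(mask)
    n = len(xs)
    for _ in range(steps):
        grads = [0.0] * len(mask)
        for x, y in zip(xs, ys):
            err = sum(m * wj * x for m, wj in zip(mask, w)) - y
            for j, m in enumerate(mask):
                grads[j] += 2.0 * err * m * x / n
        w = [wj - lr * g for wj, g in zip(w, grads)]
    loss = sum((sum(m * wj * x for m, wj in zip(mask, w)) - y) ** 2
               for x, y in zip(xs, ys)) / n
    return loss, w

def outer_ip(xs, ys, n_conn=2, alpha=0.01):
    """Outer integer problem: enumerate binary connection patterns and
    score each by training loss plus a penalty per active connection
    (an assumed surrogate for the covariance/identifiability term)."""
    best = None
    for mask in itertools.product([0, 1], repeat=n_conn):
        loss, w = inner_nlp(mask, xs, ys)
        score = loss + alpha * sum(mask)
        if best is None or score < best[0]:
            best = (score, mask, w)
    return best
```

On data generated by y = 2x, the penalty steers the search away from both the empty network (high loss) and the fully connected one (redundant parameters), leaving a single active connection with weight near 2.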

Keywords: Artificial Neural Networks; Error Propagation; Mixed Integer Nonlinear Programming; Optimal Input Selection; Parameter Uncertainty

References:

1) Dua, V. (2010). A mixed-integer programming approach for optimal configuration of artificial neural networks. Chemical Engineering Research and Design, 88(1), 55-60.

2) Dua, V. (2006). Optimal configuration of artificial neural networks. In Computer Aided Chemical Engineering (Vol. 21, pp. 1599-1604). Elsevier.

3) Sildir, H., Aydin, E., & Kavzoglu, T. (2020). Design of feedforward neural networks in the classification of hyperspectral imagery using superstructural optimization. Remote Sensing, 12(6), 956.