(767b) Comprehensive Computer-Aided Molecular Design Framework for Pure Component Design | AIChE

Computer-aided molecular design (CAMD) has become a powerful tool for designing novel molecules that meet target product property criteria. Over the years, the field has matured from neighborhood-based searches to fathoming the chemical space using optimization and enumeration. The techniques used in this field are rooted in basic group contribution (GC) property methods, so more generic design problems have had to be either decomposed or projected onto GC properties to make them compatible with CAMD methods. These techniques have also been plagued by the exponentially growing search space of molecular structures, which inhibits a generic enumeration-based design approach.

We have developed an optimization-based framework for basic molecular design that aims to automate, augment, and accelerate the steps in CAMD. The framework uses integer optimization to cut down the computational cost of CAMD. Using property decomposition and model transformation, it achieves the flexibility required for extensive molecular design while retaining accuracy for basic group contribution properties. The framework is based on GC+ [1] property models for thermodynamic properties such as Tc, Pc, Tm, Tb, and Hv, and it is split into three stages. In the first stage, mixed-integer linear programming (MILP) is used to generate molecular candidates that fit a relaxed range of the defined property targets under approximate GC+ models. This approach tackles the exponentially increasing search space very effectively by exploiting the linearity of the property models. The second stage screens the solutions further by applying structure-dependent corrections to the approximate property values. Detailed chemical structures are generated using a graph-theoretic approach, and the different descriptors used in the various GC+ models are accommodated by estimating their contributions via connectivity indices. The last stage is used to validate the model with molecule-specific property techniques and to integrate the CAMD problem with a range of more involved property prediction techniques, which extends its applicability to diverse problems.
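To make the first stage concrete, the sketch below shows how a linear group contribution model and a linear structural feasibility rule define a candidate set of group-count vectors. It is only an illustration under stated assumptions: the group table, the contribution values, the base constant, and the function names are hypothetical placeholders and are not taken from the GC+ models, and a small brute-force enumeration stands in for the MILP solver that the framework actually relies on. The feasibility rule used is the standard valence balance for acyclic molecules, sum of n_i (2 - v_i) = 2.

```python
from itertools import product

# Hypothetical group table: name -> (valence, illustrative linear
# contribution to the normal boiling point in K). Placeholder numbers only;
# they are NOT GC+ parameters.
GROUPS = {
    "CH3": (1, 23.6),
    "CH2": (2, 22.9),
    "CH": (3, 21.7),
    "OH": (1, 92.9),
}
TB_BASE = 198.0  # illustrative constant term of the linear model


def estimate_tb(counts):
    """Approximate Tb as a linear function of the group counts."""
    return TB_BASE + sum(n * GROUPS[g][1] for g, n in counts.items())


def is_acyclic_feasible(counts):
    """Valence balance for an acyclic molecule: sum n_i * (2 - v_i) == 2."""
    return sum(n * (2 - GROUPS[g][0]) for g, n in counts.items()) == 2


def stage_one(tb_min, tb_max, max_per_group=4, max_groups=6):
    """Return (counts, Tb) pairs meeting the relaxed Tb window.

    Both constraints are linear in the integer group counts, which is what
    lets an MILP solver handle them; here a bounded enumeration is used
    instead so that the sketch needs no external solver.
    """
    names = list(GROUPS)
    candidates = []
    for combo in product(range(max_per_group + 1), repeat=len(names)):
        if not 2 <= sum(combo) <= max_groups:
            continue
        counts = dict(zip(names, combo))
        if not is_acyclic_feasible(counts):
            continue
        tb = estimate_tb(counts)
        if tb_min <= tb <= tb_max:
            candidates.append((counts, tb))
    return candidates


if __name__ == "__main__":
    for counts, tb in stage_one(330.0, 345.0):
        print(counts, round(tb, 1))
```

The group-count vectors returned here carry no connectivity information, so isomers are indistinguishable at this point; resolving them into detailed structures and correcting the property estimates is the role of the second stage described above.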

We present several extensions to the framework that incorporate other property models for various applications. These extensions range from simple group contribution models for solvent design, e.g. Hansen solubility parameters and UNIFAC, to complex property models and simulations for other diverse applications. We demonstrate the framework's ease of use and present a simple methodology for extending it to a user's desired application. We conclude by presenting several case studies illustrating the efficiency, wide application domain, and versatility of the proposed framework.

[1] R. Gani, P. M. Harper, and M. Hostrup. Automatic Creation of Missing Groups through Connectivity Index for Pure-Component Property Prediction. Industrial & Engineering Chemistry Research, 44:7262–7269, 2005.
