(365d) Grid Evaluation of Pure-Compound Properties | AIChE

Authors 

Diky, V. - Presenter, National Institute of Standards and Technology
Kazakov, A., National Institute of Standards and Technology
Kroenlein, K., National Institute of Standards and Technology
Holistic and internally consistent evaluation of property data is the essential idea behind the NIST ThermoData Engine (TDE, Standard Reference Database 103b). Developed at the Thermodynamics Research Center (TRC), TDE enforces the thermodynamic consistency of all recommendations for a given compound, either implicitly through an equation of state or explicitly by requiring that models agree within their uncertainties. However, evaluating the properties of a compound without considering the behavior of related compounds leaves recommendation quality susceptible to reporting errors and natural experimental variability, as well as to inconsistencies among prediction methods when method selection varies from compound to compound. Discrepancies that exceed engineering requirements are often observed between different prediction schemes or between experiments and predictions; such heterogeneous data can interfere with good chemical design.

An automated method for identifying inconsistencies within a family of compounds has been developed. It is based on building all possible series that involve the compound of interest and in which the same change in molecular structure is repeated from one member to the next. Because direct analysis of molecular structure is complex, descriptors expressing molecular structure are used as surrogate variables. Such a descriptor might be, for example, a group count from the decomposition used by a group-contribution prediction method, or a component used in a molecular structure similarity assessment. A one-dimensional data analysis is then applied to each pair of vectors representing the molecular descriptor and the property value. Extending this analysis to multiple dimensions is also possible.
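The series-building and one-dimensional analysis steps described above can be illustrated with a minimal Python sketch. It is not the TDE implementation: the group-count descriptors, the compound labels, the tolerance, and the choice of a Theil-Sen (median-slope) line as the one-dimensional analysis are all assumptions made for illustration, and the property values are synthetic rather than measured.

```python
from collections import defaultdict
from itertools import combinations
from statistics import median

# Hypothetical group-count descriptors and SYNTHETIC property values
# (not measured data). "C3" is deliberately corrupted by +20.
COMPOUNDS = {
    "C1": ({"CH3": 2, "CH2": 1}, 325.0),
    "C2": ({"CH3": 2, "CH2": 2}, 350.0),
    "C3": ({"CH3": 2, "CH2": 3}, 395.0),  # would be 375.0 on the trend
    "C4": ({"CH3": 2, "CH2": 4}, 400.0),
    "C5": ({"CH3": 2, "CH2": 5}, 425.0),
    "A1": ({"CH3": 1, "CH2": 1, "OH": 1}, 350.0),
    "A2": ({"CH3": 1, "CH2": 2, "OH": 1}, 372.0),
    "A3": ({"CH3": 1, "CH2": 3, "OH": 1}, 394.0),
}


def build_series(compounds, min_len=3):
    """Group compounds that differ only in the count of one descriptor."""
    descriptors = sorted({d for counts, _ in compounds.values() for d in counts})
    series = []
    for d in descriptors:
        groups = defaultdict(list)
        for name, (counts, value) in compounds.items():
            # Everything except descriptor d must match within a series.
            rest = frozenset((k, v) for k, v in counts.items() if k != d and v)
            groups[rest].append((counts.get(d, 0), value, name))
        for members in groups.values():
            # Keep only series with enough distinct descriptor values.
            if len({x for x, _, _ in members}) >= min_len:
                series.append((d, sorted(members)))
    return series


def flag_outliers(members, tol):
    """Robust 1-D check: fit a Theil-Sen line, then flag large residuals."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1, _), (x2, y2, _) in combinations(members, 2)
        if x1 != x2
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y, _ in members)
    return [name for x, y, name in members if abs(y - (slope * x + intercept)) > tol]


# Map each flagged compound to the descriptors along which it looks inconsistent.
flagged = {}
for descriptor, members in build_series(COMPOUNDS):
    for name in flag_outliers(members, tol=5.0):
        flagged.setdefault(name, []).append(descriptor)
print(flagged)  # {'C3': ['CH2']}
```

A median-based fit is used here only because a single bad value does not drag the trend toward itself; a straight line is also a simplification, since real properties often vary nonlinearly along a series.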

Once this web of relationships has been defined, the analysis can be performed for any property that is expected to vary regularly with structure (with exceptions for the smallest members of a series and for odd-even effects). For that purpose, a grid of compounds with the most reliable values, as defined by expert analysis, can be built and gradually extended with predicted values where necessary. The whole grid can then be refined iteratively as confidence is gained in newly analyzed data. The properties for which consistency is most important are critical parameters and normal boiling points or acentric factors. The proposed method is related to previously published work in which prediction methods are reparameterized for each compound on the basis of data for the set of most similar compounds with known properties. Its application should increase the reliability and stability of property data evaluation for pure compounds.
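The idea of seeding a grid with reliable values and extending it with predictions can be sketched as follows. This is a hedged illustration for a single series only: the function name `extend_grid`, the "reliable"/"predicted" labels, and the use of linear interpolation and extrapolation as the prediction step are assumptions, and the seed values are synthetic.

```python
def extend_grid(values, positions):
    """Fill missing positions in one series from already-accepted neighbours.

    values: dict mapping position -> (value, source), where source is
    "reliable" for seed values; filled entries are marked "predicted".
    """
    grid = dict(values)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for p in positions:
            if p in grid:
                continue
            if p - 1 in grid and p + 1 in grid:
                # Interpolate between the two adjacent accepted values.
                est = (grid[p - 1][0] + grid[p + 1][0]) / 2
            elif p - 1 in grid and p - 2 in grid:
                # Extrapolate upward from the two values below.
                est = 2 * grid[p - 1][0] - grid[p - 2][0]
            elif p + 1 in grid and p + 2 in grid:
                # Extrapolate downward from the two values above.
                est = 2 * grid[p + 1][0] - grid[p + 2][0]
            else:
                continue  # not enough accepted neighbours yet
            grid[p] = (est, "predicted")
            changed = True
    return grid


# Synthetic seed values standing in for expert-selected reliable data.
seed = {2: (350.0, "reliable"), 3: (375.0, "reliable"), 5: (425.0, "reliable")}
grid = extend_grid(seed, positions=range(1, 8))
for position in sorted(grid):
    print(position, grid[position])
```

Keeping the source label next to each value lets a later refinement pass replace a "predicted" entry once newly analyzed data for that compound is accepted, without touching the reliable seeds.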