Tomas Öberg Konsult AB


Boiling points of halogenated aliphatic compounds: A quantitative structure-property relationship for prediction and validation
Öberg, T.
Presentation at the Third Indo-US Workshop on Mathematical Chemistry, Duluth, Minnesota, USA, August 2-7, 2003.

Abstract
Halogenated aliphatic compounds have many technical uses, but substances within this group are also ubiquitous environmental pollutants that can affect the ozone layer and contribute to global warming. The establishment of quantitative structure-property relationships (QSPR) is of interest not only to fill gaps in the available database, but also to validate already acquired experimental data.

This study was undertaken to model the relationship between computationally derived molecular descriptors and experimentally determined normal boiling points (NBP). The three-dimensional structures of 240 compounds, with boiling points between 191 and 462 K, were modeled with molecular mechanics prior to the generation of 1175 empirical descriptors. Two bilinear projection methods, principal component analysis (PCA) and partial least squares regression (PLSR), were used to identify 19 outliers. The remaining 221 objects were randomly assigned to a calibration set of 146 objects and a test set of 75 objects.
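The random assignment described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the function name `split_calibration_test`, the integer object identifiers, and the fixed seed are assumptions introduced here for illustration. Only the counts (240 compounds, 19 outliers, 146 calibration objects) come from the abstract.

```python
import random


def split_calibration_test(object_ids, n_calibration, seed=0):
    """Randomly partition object_ids into a calibration list and a test list.

    Hypothetical helper for illustration; the seed is fixed only so that
    the split is reproducible.
    """
    if not 0 <= n_calibration <= len(object_ids):
        raise ValueError("n_calibration must be between 0 and the number of objects")
    shuffled = list(object_ids)            # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, reproducible shuffle
    return shuffled[:n_calibration], shuffled[n_calibration:]


# 240 compounds minus 19 outliers leaves 221 objects (counts from the abstract).
# The integer identifiers are placeholders, not real compound labels.
retained = list(range(240 - 19))
calibration, test = split_calibration_test(retained, n_calibration=146)
print(len(calibration), len(test))  # 146 75
```

Keeping the test objects entirely out of the calibration step is what allows the test-set errors to serve as an estimate of predictive ability.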

PLSR was subsequently used to build a multivariate calibration model by extracting the latent variables that describe most of the covariation between the molecular structure and the boiling point. A jack-knifing procedure was used to select 511 descriptor variables for inclusion in the PLSR model. The standard error of prediction (SEP) for the test set was 6.4 K and the average absolute error was 4.8 K. Both error estimates are in close agreement with the anticipated lower bound of the experimental error reported in other studies. As expected, the physical-chemical interpretation of the main latent variables indicated that the main source of variation in the boiling point can be attributed to molecular size and polarizability.
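The two test-set error measures quoted above can be computed as in the following sketch. The abstract does not state which definition of SEP was used, so the bias-corrected form common in chemometrics is assumed here, and the root mean square error of prediction (RMSEP) is returned alongside it for comparison. The function name `prediction_errors` and the four boiling points in the example are made up for illustration and are not data from the study.

```python
import math


def prediction_errors(observed, predicted):
    """Return bias, SEP, RMSEP and average absolute error (same unit as input).

    Assumption: SEP is taken as the bias-corrected standard deviation of the
    residuals (n - 1 in the denominator); RMSEP is the uncorrected root mean
    square error.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    if n < 2:
        raise ValueError("at least two objects are needed")
    residuals = [p - o for o, p in zip(observed, predicted)]
    bias = sum(residuals) / n
    sep = math.sqrt(sum((r - bias) ** 2 for r in residuals) / (n - 1))
    rmsep = math.sqrt(sum(r * r for r in residuals) / n)
    mean_abs = sum(abs(r) for r in residuals) / n
    return {"bias": bias, "sep": sep, "rmsep": rmsep, "mean_abs": mean_abs}


# Made-up boiling points in kelvin, for illustration only.
observed = [300.0, 350.0, 400.0, 450.0]
predicted = [304.0, 346.0, 402.0, 448.0]
errors = prediction_errors(observed, predicted)
print(errors)
```

The average absolute error is never larger than the RMSEP, and large individual residuals widen the gap between the two because squared residuals weight them more heavily.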

Recalibration with implicit non-linear latent variable regression (INLR) did not improve the calibration result, suggesting that the bilinear model is indeed sufficient. Boiling points were also estimated with an extension of the group contribution method of Stein and Brown, and the average absolute error was then slightly above 20 K. The group contribution method was developed for general applicability and consequently lacks some accuracy and precision when considering local phenomena.

Finally, prediction results for the initially removed outliers were compared with data from another source. These additional experimental data were in good agreement with the PLSR model, demonstrating that this is a viable approach. In total, 92.5% of the originally reported data values were validated either by the model or by experimental data reported elsewhere.


