TY - JOUR
T1 - Simplified molecular input line entry system (SMILES) as an alternative for constructing quantitative structure-property relationships (QSPR)
AU - Toropov, Andrey A.
AU - Toropova, Alla P.
AU - Mukhamedzhanova, Dilya V.
AU - Gutman, Ivan
PY - 2005/8
Y1 - 2005/8
N2 - Flexible descriptors calculated with correlation weights of fragments in the SMILES notation of molecular systems have been used as a tool for modeling normal boiling points of acyclic carbonyl compounds. Four variants of the Optimization of Correlation Weights of SMILES Fragments (OCWSF) have been examined. They differ in the number of symbols in the SMILES fragments: fragments of one, two, three, and four symbols have been examined. Correlation weights for three calculable features of SMILES are used in the OCWSF scheme: number of oxygen atoms (NO), number of double bonds (NDB), and (NO - NDB + 10). These three features have been included in the OCWSF scheme in order to take into account hydrogen bond interactions. The best OCWSF model is based on three-symbol fragments together with these three features of the SMILES notation. Its statistical characteristics are: n=100, r²=0.9795, s=5.35°C, F=4673 (training set); n=100, r²=0.9764, s=5.38°C, F=4055 (test set).
AB - Flexible descriptors calculated with correlation weights of fragments in the SMILES notation of molecular systems have been used as a tool for modeling normal boiling points of acyclic carbonyl compounds. Four variants of the Optimization of Correlation Weights of SMILES Fragments (OCWSF) have been examined. They differ in the number of symbols in the SMILES fragments: fragments of one, two, three, and four symbols have been examined. Correlation weights for three calculable features of SMILES are used in the OCWSF scheme: number of oxygen atoms (NO), number of double bonds (NDB), and (NO - NDB + 10). These three features have been included in the OCWSF scheme in order to take into account hydrogen bond interactions. The best OCWSF model is based on three-symbol fragments together with these three features of the SMILES notation. Its statistical characteristics are: n=100, r²=0.9795, s=5.35°C, F=4673 (training set); n=100, r²=0.9764, s=5.38°C, F=4055 (test set).
UR - http://www.scopus.com/inward/record.url?scp=28244474463&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=28244474463&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:28244474463
SN - 0376-4710
VL - 44
SP - 1545
EP - 1552
JO - Indian Journal of Chemistry - Section A Inorganic, Physical, Theoretical and Analytical Chemistry
JF - Indian Journal of Chemistry - Section A Inorganic, Physical, Theoretical and Analytical Chemistry
IS - 8
ER -