A smile is all you need: Predicting limiting activity coefficients from SMILES with natural language processing

June 15, 2022 · Declared Dead · 🏛 Digital Discovery

📜 CAUSE OF DEATH: Death by README
Repo has only a README

Repo contents: .gitignore, README.md

Authors: Benedikt Winter, Clemens Winter, Johannes Schilling, André Bardow
arXiv ID: 2206.07048
Category: physics.chem-ph
Cross-listed: cs.CL, cs.LG, q-bio.QM
Citations: 42
Venue: Digital Discovery
Repository: https://github.com/Bene94/SMILES2PropertiesTransformer (⭐ 23)
Last Checked: 1 month ago
Abstract
Knowledge of mixtures' phase equilibria is crucial in nature and technical chemistry. Phase equilibria calculations of mixtures require activity coefficients. However, experimental data on activity coefficients are often limited due to the high cost of experiments. For an accurate and efficient prediction of activity coefficients, machine learning approaches have been recently developed. However, current machine learning approaches still extrapolate poorly for activity coefficients of unknown molecules. In this work, we introduce the SMILES-to-Properties-Transformer (SPT), a natural language processing network to predict binary limiting activity coefficients from SMILES codes. To overcome the limitations of available experimental data, we initially train our network on a large dataset of synthetic data sampled from COSMO-RS (10 million data points) and then fine-tune the model on experimental data (20 870 data points). This training strategy enables SPT to accurately predict limiting activity coefficients even for unknown molecules, cutting the mean prediction error in half compared to state-of-the-art models for activity coefficient predictions such as COSMO-RS and UNIFAC, and improving on recent machine learning approaches.

📜 Similar Papers

In the same crypt: physics.chem-ph

R.I.P. 👻 Ghosted

Machine learning for molecular simulation

Frank Noé, Alexandre Tkatchenko, ... (+2 more)

physics.chem-ph 🏛 Annual review of physical chemistry (Print) 📚 759 cites 6 years ago

Died the same way: 📜 Death by README