GPS++: An Optimised Hybrid MPNN/Transformer for Molecular Property Prediction
November 18, 2022 · Entered Twilight · arXiv.org
Repo contents: .gitignore, .style.yapf, GPSplusplus.pdf, LICENSE, Makefile, NOTICE, OGB_paper_diagram.png, README.md, argparser.py, configs, conftest.py, custom_callbacks.py, data_utils, inference.py, model, notebook_inference.ipynb, notebook_training.ipynb, notebook_utils.py, pcqm4mv2-cross_val_splits, pipeline, plotting.py, pytest.ini, requirements-dev.txt, requirements.txt, run_training.py, static_ops, tests, utils.py, xpu.py
Authors
Dominic Masters, Josef Dean, Kerstin Klaser, Zhiyi Li, Sam Maddrell-Mander, Adam Sanders, Hatem Helal, Deniz Beker, Ladislav Rampášek, Dominique Beaini
arXiv ID
2212.02229
Category
q-bio.QM
Cross-listed
cs.LG
Citations
30
Venue
arXiv.org
Repository
https://github.com/graphcore/ogb-lsc-pcqm4mv2
★ 79
Last Checked
1 month ago
Abstract
This technical report presents GPS++, the first-place solution to the Open Graph Benchmark Large-Scale Challenge (OGB-LSC 2022) for the PCQM4Mv2 molecular property prediction task. Our approach implements several key principles from the prior literature. At its core, GPS++ is a hybrid MPNN/Transformer model that incorporates 3D atom positions and an auxiliary denoising task. GPS++ achieves a mean absolute error of 0.0719 on the independent test-challenge PCQM4Mv2 split. Thanks to Graphcore IPU acceleration, GPS++ scales to deep architectures (16 layers), training at 3 minutes per epoch, and to a large ensemble (112 models), completing the final predictions in 1 hour 32 minutes, well under the allocated 4-hour inference budget. Our implementation is publicly available at: https://github.com/graphcore/ogb-lsc-pcqm4mv2.
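The abstract reports a mean absolute error (MAE) for a 112-model ensemble but does not say how the individual predictions are combined. The sketch below is a minimal illustration that assumes simple averaging across models. The function names `ensemble_mean` and `mean_absolute_error` and the toy numbers are hypothetical and are not taken from the GPS++ repository.

```python
from statistics import mean


def ensemble_mean(per_model_preds):
    """Average predictions across models, giving one value per molecule.

    per_model_preds: list of lists, one inner list per model.
    Assumption: the ensemble is combined by an unweighted mean.
    """
    return [mean(column) for column in zip(*per_model_preds)]


def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Toy targets and predictions (illustrative only, not PCQM4Mv2 data).
targets = [5.0, 6.0, 4.0]
preds = [
    [5.5, 6.0, 3.0],  # hypothetical model 1
    [4.5, 6.5, 4.5],  # hypothetical model 2
    [5.0, 5.5, 4.5],  # hypothetical model 3
]

ensemble = ensemble_mean(preds)
mae = mean_absolute_error(targets, ensemble)
print("ensemble MAE:", mae)
for i, p in enumerate(preds, start=1):
    print(f"model {i} MAE:", mean_absolute_error(targets, p))
```

In this constructed example the per-model errors cancel exactly, so the averaged prediction has a lower MAE than any single model. Real ensembles usually show a smaller gain.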
Similar Papers
In the same crypt: q-bio.QM
GuacaMol: Benchmarking Models for De Novo Molecular Design
DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences
ProtVec: A Continuous Distributed Representation of Biological Sequences
A Perspective on Deep Imaging