Machine learning potentials are becoming a widespread tool for the discovery and modelling of materials at the atomic scale. One of the most important challenges in the development of such potentials is the identification of their regimes of robust accuracy.
In this article, published in Physical Review B, an international team of young researchers from Italy and Switzerland shows that, contrary to popular assumptions, machine learning potentials make their predictions almost exclusively in an extrapolation regime.
The authors then discuss alternative ways to identify regimes of high accuracy of machine learning potentials and propose to rationalize the robust extrapolation in terms of the probability density induced by training points.
The proposed approach exploits novel density estimation techniques based on estimating the intrinsic dimensionality of a dataset, which is often much lower than the number of features. It reveals a strong correlation between the relative distribution of data in the training and test sets and the accuracy of machine learning potentials on those sets.
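To make the idea of intrinsic dimensionality and density estimation more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a two-nearest-neighbour (TwoNN) maximum-likelihood estimator for the intrinsic dimension and a plain k-nearest-neighbour density estimate evaluated in that dimension; the text above does not specify which estimators are used in the article. The function names, the synthetic data, and the choice k = 10 are all hypothetical and serve only as illustration.

```python
import math
import numpy as np


def twonn_intrinsic_dimension(x):
    """Estimate the intrinsic dimension from ratios of 2nd to 1st neighbour distances.

    Assumes no duplicate points (a zero nearest-neighbour distance would
    make the ratio undefined).
    """
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    dists.sort(axis=1)                 # column 0 is the distance of each point to itself
    mu = dists[:, 2] / dists[:, 1]     # r2 / r1 for every point
    # Maximum-likelihood estimate for ratios following a Pareto law with exponent d
    return len(x) / np.sum(np.log(mu))


def knn_log_density(train, query, k, dim):
    """Log of a k-NN density estimate of `train`, evaluated at `query` points.

    The volume of the ball reaching the k-th neighbour is computed in the
    intrinsic dimension `dim` rather than in the number of features.
    """
    dists = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=-1)
    r_k = np.sort(dists, axis=1)[:, k - 1]
    log_unit_ball = 0.5 * dim * math.log(math.pi) - math.lgamma(0.5 * dim + 1.0)
    return math.log(k) - math.log(len(train)) - log_unit_ball - dim * np.log(r_k)


# Synthetic illustration: a 2D sheet embedded linearly in a 10-feature space.
rng = np.random.default_rng(0)
embedding = rng.normal(size=(2, 10))
train = rng.uniform(size=(500, 2)) @ embedding

d_hat = twonn_intrinsic_dimension(train)

# One query inside the region covered by the training data, one far outside it.
queries = np.array([[0.5, 0.5], [5.0, 5.0]]) @ embedding
log_rho = knn_log_density(train, queries, k=10, dim=d_hat)
```

In this sketch `d_hat` should come out close to 2 even though each point has 10 features, and the query far from the training data receives a lower log-density than the one inside it. Comparing such density values with the prediction error on test points is one way to look for the kind of correlation described above.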
Claudio Zeni, Andrea Anelli, Aldo Glielmo, and Kevin Rossi, Phys. Rev. B 105, 165141
© 2022, The Author(s)