30 April 2021

Generating neural network potentials

The scope of this work is the generation of a general-purpose potential that can deal, in an unbiased way, with very different carbon polymorphs. At present, it is possible to develop custom-tailored potentials for materials in specific crystal structures, but their accuracy does not transfer to other conformations. When searching for new or unknown crystal structures that satisfy a critical requirement, robust predictions can be achieved (at least empirically) by training the potential on a very broad data set comprising all possibly relevant local atomic environments.

The team built this work on top of the development of the PANNA code (PANNA: Properties from Artificial Neural network Architectures, Comput Phys Commun 256, 107402 (2020)). The international collaboration involved Yusuf Shaidu (former SISSA PhD student, now postdoc at UC Berkeley), Emine Küçükbenli (former SISSA postdoc, now postdoc at Harvard University), Ruggero Lot (joint PhD student at SISSA and Toulouse University), Franco Pellegrini (former SISSA postdoc, now postdoc at the École Normale Supérieure, Paris), Efthimios Kaxiras (full professor at Harvard) and Stefano de Gironcoli (SISSA).

The scientists chose to combine crystal-structure phase-space exploration via evolutionary algorithms with neural network training in a self-consistent way. This makes it possible to develop a general carbon potential that reproduces a number of physical properties across different crystal structures, serving as a basis for the search for novel graphene-derived anodes for Li-ion batteries.
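The control flow of such a self-consistent scheme can be sketched as follows. This is a hypothetical toy illustration, not the authors' implementation: `explore`, `prune`, `reference_energy` and `train` are placeholder stand-ins for the evolutionary search, the structure pruning, the DFT calculations and the PANNA neural network training, respectively, and "structures" are reduced to one-dimensional coordinates.

```python
# Toy sketch of the self-consistent cycle: explore -> prune -> label -> retrain.
# All function bodies are simplified stand-ins for the real workflow components.
import random

def explore(model, n=20, seed=0):
    """Stand-in for the evolutionary search: propose candidate 'structures'
    (here just 1-D coordinates; the real search uses the current model)."""
    rng = random.Random(seed)
    return [rng.uniform(-2.0, 2.0) for _ in range(n)]

def prune(candidates, dataset, min_dist=0.1):
    """Keep only candidates sufficiently far from every known structure."""
    kept = []
    for c in candidates:
        if all(abs(c - x) > min_dist for x, _ in dataset) and \
           all(abs(c - k) > min_dist for k in kept):
            kept.append(c)
    return kept

def reference_energy(x):
    """Stand-in for a DFT single-point calculation."""
    return x ** 2  # toy potential energy surface

def train(dataset):
    """Stand-in for neural-network training: a trivial
    nearest-neighbour 'model' over the labelled data."""
    def model(x):
        return min(dataset, key=lambda p: abs(p[0] - x))[1]
    return model

dataset = [(x, reference_energy(x)) for x in (-1.0, 0.0, 1.0)]  # seed data
model = train(dataset)
for iteration in range(5):                       # self-consistent cycles
    new = prune(explore(model, seed=iteration), dataset)
    if not new:                                  # converged: nothing novel found
        break
    dataset += [(x, reference_energy(x)) for x in new]  # label new structures
    model = train(dataset)                       # refine the potential
```

The key structural point is the stopping criterion: the cycle ends when the exploration step no longer produces structures that survive pruning.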

The computationally intensive part of the procedure is the evaluation of the energy of the training configurations via state-of-the-art DFT calculations; these sets typically comprise thousands of new structures at each self-consistent iteration. Since each configuration is independent and can be evaluated in parallel, these energies can be computed efficiently on future exascale resources via high-throughput protocols. The integration of this idea in the MaX ecosystem makes this part of the work a MaX demonstrator activity.
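Because each configuration is independent, the evaluations form an embarrassingly parallel workload. A minimal sketch of the pattern, with a placeholder energy function standing in for a DFT single-point calculation (a real high-throughput run would dispatch an external DFT code through a workflow engine):

```python
# Independent single-point evaluations dispatched concurrently.
from concurrent.futures import ThreadPoolExecutor

def single_point_energy(structure_id):
    """Placeholder for one independent DFT single-point calculation."""
    return structure_id, float(structure_id) ** 0.5  # toy 'energy'

structures = list(range(100))  # thousands of structures in the real workflow
with ThreadPoolExecutor(max_workers=8) as pool:
    # map() preserves order; every task runs independently of the others.
    results = dict(pool.map(single_point_energy, structures))
```

On an HPC system the same pattern scales out across nodes, since no evaluation depends on the result of any other.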


Figure - The self-consistent scheme. The initial step to start the process (yellow arrow) can be performed with a classical force field as shown here, or any comprehensive dataset of structures such as the ones in Aflowlib, Materials Genome Initiative, or Nomad repositories can be used to generate the first neural network potential model (blue triangle) to be refined through the self-consistent cycle. Once an initial potential model is chosen, an evolutionary algorithm enables a diverse set of structures to be sampled. The following clustering-based pruning of structures further ensures that no single polymorph biases the dataset, i.e., at each step only novel structures (red and blue disks for the particular step highlighted above) are to be considered, further refined, and added to the dataset. The subsequent MD simulations sample the potential energy surface of each polymorph. Finally, DFT calculations performed on a subset of MD-sampled structures are added to the ab initio dataset obtained thus far. The ab initio dataset augmented this way is then used to train the next neural network potential model (a darker blue triangle), starting the next cycle of the self-consistent scheme until no new structures are found by the evolutionary algorithm.
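The novelty test behind the clustering-based pruning in the figure can be illustrated with a small sketch: a candidate counts as novel only if its descriptor vector lies farther than a threshold from everything already retained. The descriptors here are plain tuples and the threshold is arbitrary; the actual workflow uses the atomic-environment descriptors computed by PANNA.

```python
# Illustrative novelty filter: keep a candidate only if its descriptor
# is far (in Euclidean distance) from the dataset and from other keepers.
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def novel_structures(candidates, dataset, threshold=0.5):
    """Return candidates whose descriptors are not already represented."""
    kept = []
    for desc in candidates:
        near_data = any(distance(desc, d) <= threshold for d in dataset)
        near_kept = any(distance(desc, k) <= threshold for k in kept)
        if not (near_data or near_kept):
            kept.append(desc)  # novel: to be refined and added to the dataset
    return kept

dataset = [(0.0, 0.0), (1.0, 0.0)]            # descriptors already in the dataset
candidates = [(0.1, 0.0), (2.0, 2.0), (2.1, 2.0)]
new = novel_structures(candidates, dataset)    # only (2.0, 2.0) is novel
```

Checking candidates against each other as well as against the dataset is what prevents a burst of near-identical structures from one polymorph dominating a single iteration.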

Shaidu, Y., Küçükbenli, E., Lot, R. et al. A systematic approach to generating accurate neural network potentials: the case of carbon. npj Comput Mater 7, 52 (2021).

DOI: 10.1038/s41524-021-00508-6