SIESTA is a first-principles materials simulation program based on density-functional theory (DFT). It was one of the first codes to enable the treatment of large systems with first-principles electronic-structure methods, which opened up new research avenues in many disciplines. SIESTA has become quite popular and is increasingly used by researchers in geosciences, biology and engineering (apart from those in its natural habitats of materials physics and chemistry). SIESTA's efficiency stems from the use of strictly localized basis sets and from the implementation of linear-scaling algorithms that can be applied to suitable systems. A very important feature of the code is that its accuracy and cost can be tuned over a wide range, from quick exploratory calculations to highly accurate simulations matching the quality of other approaches, such as plane-wave methods.
Apart from the increased efficiency brought by the reduced cardinality of an atomic-orbital basis set, finite-support basis orbitals are the key for calculating the (sparse) Hamiltonian and overlap matrices in O(N) operations, and for the use of reduced-scaling algorithms. SIESTA uses the standard Kohn-Sham self-consistent density functional method in the local density (LDA-LSD) or generalized gradient (GGA) approximations, and also implements functionals capable of describing van der Waals interactions. It employs norm-conserving pseudopotentials in their fully nonlocal (Kleinman-Bylander) form. It projects the wavefunctions and density onto a real-space grid in order to calculate the Hartree and exchange-correlation potentials and their matrix elements.
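The role of the real-space grid can be illustrated with a minimal sketch: on a periodic grid, the Hartree potential follows from the density by solving the Poisson equation in reciprocal space via FFTs. The grid size, box length and model density below are illustrative choices, not SIESTA defaults.

```python
import numpy as np

# Hedged sketch (not SIESTA code): Hartree potential on a periodic
# real-space grid by solving the Poisson equation with FFTs.
n, L = 32, 10.0                              # illustrative grid and box size
x = np.arange(n) * L / n
k = 2 * np.pi / L                            # lowest nonzero wavevector
X = x[:, None, None] * np.ones((1, n, n))
# Model density rho = cos(k x): Poisson gives V = 4*pi*cos(k x)/k**2 exactly,
# which provides a clean check of the spectral solver.
rho = np.cos(k * X)

g = 2 * np.pi * np.fft.fftfreq(n, d=L / n)   # reciprocal-lattice vectors
G2 = g[:, None, None]**2 + g[None, :, None]**2 + g[None, None, :]**2
rho_G = np.fft.fftn(rho)
V_G = np.zeros_like(rho_G)
mask = G2 > 1e-12
V_G[mask] = 4 * np.pi * rho_G[mask] / G2[mask]   # G=0 term dropped (neutralizing background)
V = np.real(np.fft.ifftn(V_G))
```

The matrix elements of such grid potentials between finite-support orbitals then involve only the grid points where both orbitals are nonzero, which is what keeps the Hamiltonian sparse.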
Dissemination
The SIESTA program has always been free for academics, and is now distributed under the GPL license. Currently there are several thousand users all over the world, and the paper describing the method [1] has received more than 7,500 citations so far. A number of schools have been organized over the years, with the materials made available online. The development of SIESTA is currently centralized in the Launchpad platform, but it will soon move to the GitLab platform.
Performance and Scaling in Parallel Computing Environments
Two main phases can be distinguished in a typical electronic-structure calculation. The first is the setup of the Hamiltonian, which scales as O(N) with the size of the system and deals mostly with sparse data structures. The second is the solution of the electronic-structure problem, whose scaling can range from linear to cubic depending on the algorithm. Parallelization of the setup phase is mainly done with MPI, with some sections also parallelized with OpenMP; data are distributed over orbitals and over grid points. The solver stage, which generates the density matrix (via diagonalization, or via a formal application of the Fermi-Dirac function to the Hamiltonian, as in the PEXSI solver), is the most time-consuming step in the operation of the code. Nevertheless, the use of an atomic-orbital basis means that the matrix sizes involved are typically much smaller than in plane-wave codes. This makes SIESTA, even in the cubic-scaling diagonalization mode, very efficient and competitive in time-to-solution for medium-to-large systems with relatively modest processor counts (Figure 1).
Fig. 1. Strong scaling (speedup) for a SIESTA run with diagonalization for a system of 2048 water molecules. The S value indicates the effective “serial fraction” of the calculation (for a definition of S, see [2]).
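The solver stage can be sketched in a few lines of plain NumPy (this is an illustration, not SIESTA code, and it assumes an orthonormal basis for simplicity; SIESTA's non-orthogonal atomic orbitals add an overlap matrix). In this picture the density matrix is formally D = f(H), the Fermi-Dirac function applied to the Hamiltonian: diagonalization applies f to the eigenvalues, while PEXSI-type solvers approximate f(H) directly.

```python
import numpy as np

# Illustrative model Hamiltonian (symmetric, orthonormal basis assumed).
rng = np.random.default_rng(0)
H = rng.standard_normal((6, 6))
H = 0.5 * (H + H.T)

beta, mu = 10.0, 0.0                 # illustrative inverse temperature and chemical potential
eps, V = np.linalg.eigh(H)           # diagonalization-based solver
occ = 1.0 / (1.0 + np.exp(beta * (eps - mu)))   # Fermi-Dirac occupations f(eps)
D = (V * occ) @ V.T                  # density matrix D = V f(eps) V^T

n_electrons = np.trace(D)            # Tr D recovers the electron count
```

With localized orbitals H is sparse, which is what the reduced-scaling solvers exploit; the dense eigendecomposition above is what the default diagonalization route does.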
The PEXSI solver [3] scales at most quadratically with system size by exploiting the sparsity of the Hamiltonian and other matrices involved. It can be applied to all kinds of systems, including metals, and its accuracy is comparable to that of cubic-scaling full diagonalization. PEXSI can use large numbers of processors efficiently due to the near-perfect parallelization over poles and the good scaling of the pole-specific operations, thus enabling the treatment of very large systems on high-performance machines (Figure 2). Apart from the native interface to PEXSI in SIESTA (which was the first to be available in a mainstream code), the PEXSI solver is also accessible through an interface to the ELSI library; with almost no code changes, SIESTA can now use an expanding set of modern solvers with a unified calling sequence. Another advantage of the PEXSI solver is its radically reduced memory footprint: in Figure 2, a comparison with diagonalization is only possible when using over 1,000 processors, since with fewer the dense matrix would not fit in the (distributed) memory.
Fig. 2. Parallel strong scaling of SIESTA-PEXSI and the (ScaLAPACK) diagonalization approach when applied to a DNA chain and a graphene-boron nitride stack, prototypes of quasi-one-dimensional and two-dimensional systems. “ppp” stands for the number of MPI processes used in each pole computation, and the nearly straight lines represent increasing levels of parallelization over poles.
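The pole-parallel structure can be sketched as follows. PEXSI uses an optimized contour representation that needs only a few tens of poles; the plain Matsubara expansion below is a much cruder stand-in, used here only because it is simple while still showing the key point: each pole contributes an independent linear solve, so the poles can be distributed over processor groups and computed fully in parallel.

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.standard_normal((5, 5))
H = 0.5 * (H + H.T)                  # small symmetric model Hamiltonian
beta, mu = 4.0, 0.0                  # illustrative parameters
I = np.eye(5)

# Reference density matrix: f(H) via diagonalization.
eps, V = np.linalg.eigh(H)
D_exact = (V * (1.0 / (1.0 + np.exp(beta * (eps - mu))))) @ V.T

# Matsubara pole expansion:
#   f(H) = I/2 - sum_n 2*Re[ (beta*(H - mu*I) - i*pi*(2n-1)*I)^(-1) ]
# Each pole term is an independent resolvent computation.
A = beta * (H - mu * I)
D_poles = 0.5 * I.copy()
for n in range(1, 2001):             # many poles: plain Matsubara converges slowly
    z = 1j * np.pi * (2 * n - 1)
    D_poles = D_poles - 2.0 * np.real(np.linalg.inv(A - z * I))
```

In PEXSI the resolvents are additionally computed with sparse selected-inversion techniques rather than the dense inverses used in this toy example, which is where the reduced complexity scaling comes from.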
The PEXSI method has a relatively large pre-factor compared to diagonalization, so it is only advisable for systems of moderately large size (Figure 3). However, constant improvements to the underlying algorithms (such as reducing the number of poles needed, or adding further levels of parallelization) are bringing the break-even points down to smaller system sizes.
Fig. 3. Comparison of the total cost of a computation of systems of various dimensionalities, using the PEXSI solver and standard diagonalization, as a function of system size.
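The origin of such a break-even point can be shown with a toy cost model (the coefficients below are hypothetical, chosen only to illustrate how a larger prefactor combined with a lower scaling exponent produces a crossover system size):

```python
# Hypothetical prefactors: PEXSI-like cost ~ a*N**2, diagonalization ~ b*N**3.
a, b = 50.0, 0.5
t_pexsi = lambda n: a * n**2
t_diag = lambda n: b * n**3

# Break-even: a*N**2 = b*N**3  =>  N = a/b.
crossover = a / b
# Below the crossover, diagonalization is cheaper despite its worse scaling;
# above it, the reduced-scaling method wins by an ever larger margin.
```

Improvements that shrink the prefactor a (fewer poles, more parallelization levels) move the crossover to smaller N, which is the trend described above.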
The TranSiesta module uses advanced algorithms and incorporates a nearly complete dual MPI/OpenMP parallelization. In particular, a new pivoting algorithm (BTD: block tridiagonal) has unlocked a dramatic performance increase over previous versions using more standard approaches (Figure 4).
Fig. 4. Speedups in core TranSiesta computations brought about by the use of the BTD (block tridiagonal) pivoting algorithm.
References
[1] J. M. Soler, E. Artacho, J. D. Gale, A. Garcia, J. Junquera, P. Ordejon, and D. Sanchez-Portal, The SIESTA method for ab initio order-N materials simulation, J. Phys.: Condens. Matter 14, 2745 (2002).
[2] F. Corsetti, Performance Analysis of Electronic Structure Codes on HPC Systems: A Case Study of SIESTA. PLoS ONE 9(4): e95390 (2014).
[3] L. Lin, A. Garcia, G. Huhs, and C. Yang, Siesta-PEXSI: Massively parallel method for efficient and accurate ab initio materials simulation without matrix diagonalization, J. Phys.: Condens. Matter 26, 305503 (2014).
SIESTA is a first-principles materials simulation code based on density-functional theory (DFT). It uses atomic orbitals with finite support as a basis set, with no limit on the multiple-zeta size, the angular momenta, or the inclusion of polarization and off-site orbitals. Apart from the increased efficiency brought by the reduced cardinality of an atomic-orbital basis set, finite-support basis orbitals are the key for calculating the (sparse) Hamiltonian and overlap matrices in O(N) operations, and for the use of reduced-scaling algorithms. SIESTA uses the standard Kohn-Sham self-consistent density functional method in the local density (LDA-LSD) or generalized gradient (GGA) approximations, and also implements functionals capable of describing van der Waals interactions. It employs norm-conserving pseudopotentials in their fully nonlocal (Kleinman-Bylander) form. SIESTA uses a real-space grid in order to calculate the Hartree and exchange-correlation potentials and their matrix elements.
The SIESTA package includes a TranSiesta module to enable electronic-transport calculations within the non-equilibrium Green's function formalism.
SIESTA can provide:
Software
SIESTA is a Fortran code to perform materials simulations within DFT. Two main phases can be distinguished in a typical electronic-structure calculation: the setup of the Hamiltonian, which scales as O(N) with the size of the system and deals mostly with sparse data structures, and the solution of the electronic-structure problem, whose scaling can range from linear to cubic depending on the algorithm. Parallelization of the setup phase is mainly done with MPI, with some sections also parallelized with OpenMP. Data distribution is done over orbitals and over grid points. The solver stage is typically the most time-consuming for large systems. The default solver, and the most efficient for moderately sized systems, is based on diagonalization, with an initial conversion of the sparse matrices to dense form. Other solvers (e.g., PEXSI, CheSS) deal directly with the sparse matrices and thus have a smaller memory footprint. For massively parallel execution, the PEXSI solver offers various levels of parallelization and is able to scale to tens of thousands of processes, while also showing reduced complexity scaling, which can reach O(N) for quasi-one-dimensional systems. The TranSiesta module uses advanced algorithms and incorporates a nearly complete dual MPI/OpenMP parallelization.
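The memory argument behind the sparse-aware solvers can be made concrete with a small sketch. The fragment below builds a model nearest-neighbour Hamiltonian and converts it to a minimal CSR (compressed sparse row) layout; this is a generic illustration, not SIESTA's internal storage format.

```python
import numpy as np

# Model 1D nearest-neighbour Hamiltonian: finite-support orbitals couple
# only to neighbours, so the number of nonzeros grows as O(N), not O(N^2).
n = 200
H = np.zeros((n, n))
for i in range(n):
    H[i, i] = 2.0
    if i + 1 < n:
        H[i, i + 1] = H[i + 1, i] = -1.0

# Dense -> CSR conversion: values, column indices, and row pointers.
data, indices, indptr = [], [], [0]
for i in range(n):
    cols = np.nonzero(H[i])[0]
    data.extend(H[i, cols])
    indices.extend(cols)
    indptr.append(len(data))

dense_entries = n * n                # what a dense solver must hold
sparse_entries = len(data)           # what a sparse solver must hold
```

Diagonalization-based solvers pay the dense cost after the initial conversion; PEXSI- and CheSS-type solvers keep working with the O(N) representation throughout.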
Particular Libraries
SIESTA is an open-source code distributed under the GNU General Public Licence. Development is currently hosted on the Launchpad platform, but it will move in the Fall of 2019 to the GitLab platform. SIESTA is written in Fortran (with C for some functionality) and uses standard optimized libraries such as BLAS, LAPACK, ScaLAPACK, and netCDF/HDF5. Some optional functionality depends on domain-specific libraries such as ELSI, CheSS, libXC, and/or the embedding of a (lightweight) Lua interpreter together with associated glue libraries. Note that some of the functional modules within SIESTA are being made stand-alone and released as library packages within the Electronic Structure Library (ESL).
Parallel programming
As mentioned above, the main parallelization paradigm is MPI, with some modules (notably TranSiesta) also offering OpenMP parallelization. The solver stage, when using standard diagonalization algorithms, can also benefit from “library-mode” OpenMP through threaded linear-algebra libraries.
I/O requirements
Checkpointing is based on saving the density matrix, which is a sparse data object in SIESTA and hence moderate in size. Some modes of operation require saving (a subset of) the wavefunctions, which is a more substantial demand, but still manageable thanks to the low cardinality of SIESTA's basis set.
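A minimal sketch of this style of checkpointing follows. SIESTA writes its own .DM file format; the generic (row, column, value) triplet dump, the random sparsity pattern and the file name below are illustrative stand-ins, used only to show that a sparse checkpoint stores O(nonzeros) data and survives a save/reload round trip.

```python
import os
import tempfile
import numpy as np

# Hypothetical sparse "density matrix" pattern (illustrative, not a real DM).
rng = np.random.default_rng(3)
n = 50
row, col = np.nonzero(np.triu(rng.random((n, n))) > 0.95)
val = rng.standard_normal(row.size)

# Checkpoint: dump only the nonzero triplets plus the dimension.
path = os.path.join(tempfile.mkdtemp(), "checkpoint_dm.npz")
np.savez(path, row=row, col=col, val=val, n=n)

# Restart: reload the triplets and rebuild the matrix.
chk = np.load(path)
m = int(chk["n"])
DM = np.zeros((m, m))
DM[chk["row"], chk["col"]] = chk["val"]
```

The checkpoint size scales with the number of nonzeros, which is why a sparse density matrix keeps restart files moderate even for large systems.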