Yesterday, I committed a major update to the mainline repository, touching many files. This update merged in the files and changes needed for running QMCPACK on GPUs using the NVIDIA CUDA platform. Currently, only a subset of QMCPACK's functionality is supported:
- Only periodic and twist-averaged boundary conditions are supported at present.
- Single Slater determinants with 3D B-spline orbitals. Only real-valued wave functions are supported, but tiling complex orbitals to supercells is supported as long as each k-point is a multiple of half a G-vector of the supercell.
- Mixed basis representation in which orbitals are represented as:
- 1D splines times spherical harmonics in spherical regions (muffin tins) around atoms
- 3D B-splines in the interstitial region
- One-body and two-body Jastrows represented as 1D B-splines are supported.
- Nonlocal pseudopotentials (semilocal form)
- Coulomb interaction
- Model Periodic Coulomb (MPC) interaction
- Variational Monte Carlo (VMC)
- Wave function optimization (only the Jastrow factors can be optimized at present)
- Diffusion Monte Carlo (DMC)
Compiling the GPU code
To include GPU support, run cmake with the argument "-DQMC_CUDA=1". If you use toolchains, add "SET(QMC_CUDA 1)" to the toolchain file. The nvcc compiler should be in your path.
Using the GPU code
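As a sketch, an out-of-source build with CUDA enabled might look like the following (directory names are placeholders; adjust for your system):

```shell
# Hypothetical out-of-source build of QMCPACK with CUDA support.
cd qmcpack
mkdir build && cd build

# Enable the GPU port; nvcc must be on your PATH for CMake to find it.
cmake -DQMC_CUDA=1 ..
make -j8
```

If you maintain a toolchain file instead, the equivalent is a single line, SET(QMC_CUDA 1), in that file.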
When running QMCPACK with GPUs, you should run one MPI process per GPU on each node. Add the argument "--gpu" to the qmcapp command line, e.g.
mpirun -np 16 qmcapp --gpu myfile.xml
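Matching one MPI process to each GPU usually means fixing the per-node process count at launch. As a sketch, assuming Open MPI on four nodes with four GPUs each (the flag name varies between MPI implementations):

```shell
# Hypothetical layout: 4 nodes x 4 GPUs = 16 processes, 4 per node.
mpirun -np 16 -npernode 4 qmcapp --gpu myfile.xml
```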
Inside myfile.xml, you need to add the attribute gpu="yes" in several places:
- Inside the <determinantset> element, e.g.
- <determinantset type="einspline" href="cBN_128_V40.TN.h5"
sort="1" tilematrix="4 0 0 0 4 0 0 0 4"
twistnum="1" gpu="yes" source="i">
- Inside each of the <qmc> elements, e.g.
- <qmc method="vmc" move="pbyp" gpu="yes">
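Putting the pieces together, the relevant parts of myfile.xml might look like the sketch below. The file name and tile matrix are taken from the example above; the walkers parameter is an illustrative value, not a requirement:

```xml
<determinantset type="einspline" href="cBN_128_V40.TN.h5"
                sort="1" tilematrix="4 0 0 0 4 0 0 0 4"
                twistnum="1" gpu="yes" source="i">
  ...
</determinantset>

<qmc method="vmc" move="pbyp" gpu="yes">
  <!-- The GPU code is most efficient with many walkers per process. -->
  <parameter name="walkers">128</parameter>
  ...
</qmc>
```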
Finally, the GPU code requires many walkers to work efficiently. Presently, 128-256 walkers give good efficiency; you may be limited by the amount of memory on the GPU card.
Questions?
Email esler AT uiuc DOT edu.