[gmx-users] GPU-accelerated performance
Wes Barnett
w.barnett at columbia.edu
Wed Sep 6 14:44:10 CEST 2017
On Wed, Sep 6, 2017 at 4:58 AM, Alex <nedomacho at gmail.com> wrote:
> Hi all,
>
> We just got the new machines that were actually built with Szilárd's
> advice (a while back) and I am doing some preliminary tests. "My" machine
> has two 22-core Xeon E5 CPUs (44 cores / 88 threads total) + 3 Titan Xp
> GPUs. So far, I got good test system performance (~11K atoms, 92 ns/day)
> from '-nt 36' and running on all GPUs. Further increasing the number of
> threads only reduces performance. The test system is a CHARMM-based
> lipid+water setup, elongated in the Z-direction (6.7 nm x 6.7 nm x 11.2
> nm). Very decent performance (70 ns/day) with 16 CPU cores and two GPUs.
>
> Any suggestions on how to further increase what we can squeeze out of this thing
> for a single simulation? The relevant mdp section is below (CHARMM
> defaults, really). What would you try in your mdrun line?
>
> Thanks!
>
> Alex
>
> ****
> cutoff-scheme = Verlet
> nstlist = 20
> rlist = 1.2
> coulombtype = pme
> rcoulomb = 1.2
> vdwtype = Cut-off
> vdw-modifier = Force-switch
> rvdw_switch = 1.0
> rvdw = 1.2
>
>
You really just have to try different options for different systems; the
performance is system-dependent. Expect larger systems to benefit more
from GPU acceleration. Try different combinations of MPI ranks and OpenMP
threads, along with specifying the GPU IDs multiple times so that several
ranks share each GPU. I always do a series of short runs with different
combinations before choosing the fastest one for a particular system and
then doing a longer production run.
See http://www.gromacs.org/Documentation/Acceleration_and_parallelization
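
For example, on your 3-GPU box a quick benchmarking sweep might look like
the following (just a sketch, assuming a thread-MPI build and a run input
named topol.tpr; the rank/thread splits are examples to vary, not
recommendations):

  # 3 ranks x 12 OpenMP threads (36 total), one rank per GPU
  gmx mdrun -deffnm topol -ntmpi 3 -ntomp 12 -gpu_id 012 -pin on \
      -nsteps 10000 -resethway

  # 6 ranks x 6 threads, two ranks per GPU -- this is what specifying
  # the GPU IDs multiple times looks like
  gmx mdrun -deffnm topol -ntmpi 6 -ntomp 6 -gpu_id 001122 -pin on \
      -nsteps 10000 -resethway

The -resethway flag resets the timing counters halfway through the run, so
startup and load-balancing costs don't skew the ns/day numbers you compare.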
--
James "Wes" Barnett
Postdoctoral Research Scientist
Department of Chemical Engineering
Kumar Research Group <http://www.columbia.edu/cu/kumargroup/>
Columbia University
w.barnett at columbia.edu
http://wbarnett.us