[gmx-users] No performance increase with single vs multiple nodes
Matthew W Hanley
mwhanley at syr.edu
Wed Oct 25 04:24:04 CEST 2017
> There's several dozen lines of performance analysis at the end of the log
> file, which you need to inspect and compare if you want to start to
> understand what is going on :-)
Thank you for the feedback. Fair warning: I'm more of a system administrator than a regular GROMACS user. What should I be focusing on and, more importantly, how do I find the bottleneck? GROMACS does recommend using AVX2_256, but I was unable to get it to build with that setting. Here is more of the log file:
On 32 MPI ranks
 Computing:          Num   Num      Call    Wall time    Giga-Cycles
                    Ranks Threads  Count       (s)        total sum     %
-----------------------------------------------------------------------------
 Domain decomp.       32    1       1666      18.920       1509.802   3.8
 DD comm. load        32    1       1666       0.017          1.394   0.0
 DD comm. bounds      32    1       1666       0.206         16.406   0.0
 Vsite constr.        32    1      50001       4.624        369.013   0.9
 Neighbor search      32    1       1667      19.646       1567.793   4.0
 Comm. coord.         32    1      48334       8.291        661.640   1.7
 Force                32    1      50001     339.477      27090.350  68.6
 Wait + Comm. F       32    1      50001      12.691       1012.783   2.6
 NB X/F buffer ops.   32    1     146669      13.563       1082.352   2.7
 Vsite spread         32    1      50001       8.716        695.518   1.8
 Write traj.          32    1          2       0.080          6.366   0.0
 Update               32    1      50001      37.268       2973.983   7.5
 Constraints          32    1      50001      25.674       2048.789   5.2
 Comm. energies       32    1       5001       0.965         77.013   0.2
 Rest                                          4.385        349.931   0.9
-----------------------------------------------------------------------------
 Total                                       494.524      39463.132 100.0
-----------------------------------------------------------------------------
If that's not the helpful part, I would need more specifics on which part of the log file would be. Failing that, if anyone could recommend some good documentation for optimizing performance, I would greatly appreciate it. Thank you!
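One way to compare runs is to rank the rows of the table above by wall time. The following is only a minimal sketch, not a GROMACS tool: the helper name `parse_cycle_table` is hypothetical, and it assumes the table layout is exactly as pasted here (the last three fields of a row are wall time, giga-cycles, and percent). The sample text is a subset of the rows quoted above.

```python
def parse_cycle_table(text):
    """Return (name, wall_seconds, percent) tuples, largest wall time first.

    Assumes each data row ends with: wall time, giga-cycles, percent.
    Rows such as "Rest" have no ranks/threads/count columns.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 4:
            continue  # blank lines and dashed separators
        try:
            wall, _gcycles, pct = (float(t) for t in tokens[-3:])
        except ValueError:
            continue  # header lines and other non-data text
        head = tokens[:-3]
        # Drop the ranks, threads, and call-count columns when present.
        if len(head) >= 4 and all(t.isdigit() for t in head[-3:]):
            head = head[:-3]
        name = " ".join(head)
        if name == "Total":
            continue  # the summary row is not a separate activity
        rows.append((name, wall, pct))
    return sorted(rows, key=lambda r: r[1], reverse=True)


# A few rows copied from the log excerpt in this message.
SAMPLE = """\
Domain decomp. 32 1 1666 18.920 1509.802 3.8
Neighbor search 32 1 1667 19.646 1567.793 4.0
Force 32 1 50001 339.477 27090.350 68.6
NB X/F buffer ops. 32 1 146669 13.563 1082.352 2.7
Update 32 1 50001 37.268 2973.983 7.5
Rest 4.385 349.931 0.9
Total 494.524 39463.132 100.0
"""

ranked = parse_cycle_table(SAMPLE)
for name, wall, pct in ranked:
    print(f"{name:<20} {wall:>10.3f} s {pct:>6.1f} %")
```

Running the same function on the single-node and multi-node logs and comparing the two rankings would show which rows grow when nodes are added. In the excerpt above, Force is the largest entry at 68.6% of the wall time.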
-Matt
Matthew Hanley
IT Analyst
College of Engineering and Computer Science
Syracuse University
mwhanley at syr.edu