[gmx-users] Too much PME mesh wall time.
Yunlong Liu
yliu120 at jh.edu
Sun Aug 24 02:19:47 CEST 2014
Hi GROMACS users,
I am seeing excessive PME mesh time in my simulation. My time accounting
is below. I am running the simulation on 2 nodes, each with 16 CPUs and
1 NVIDIA Tesla K20m GPU.
My mdrun command is:

  ibrun /work/03002/yliu120/gromacs-5/bin/mdrun_mpi -pin on -ntomp 8 -dlb no -deffnm pi3k-wt-charm-4 -gpu_id 00

I turned off dynamic load balancing manually (-dlb no) because the
simulation crashes when it is enabled. I have reported that crash to both
mailing lists and discussed it with Roland.
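
For readers checking the launch layout, here is a minimal Python sketch of the arithmetic implied by the command above. It assumes the ranks are spread evenly over the nodes with one OpenMP thread per CPU, and it reads -gpu_id as one digit per particle-particle rank on a node, which is my understanding of GROMACS 5 rather than something stated in this post. The variable names are illustrative only.

```python
# Hardware and flags as described in the post.
nodes = 2
cpus_per_node = 16
omp_threads_per_rank = 8      # from -ntomp 8
gpu_id_string = "00"          # from -gpu_id 00

# Assumption: every CPU runs exactly one OpenMP thread.
ranks_per_node = cpus_per_node // omp_threads_per_rank
total_ranks = nodes * ranks_per_node

# Assumption: each digit of -gpu_id names the GPU used by one rank on a node.
gpu_for_rank = {rank: int(gpu_id_string[rank]) for rank in range(ranks_per_node)}

print(f"{total_ranks} MPI ranks x {omp_threads_per_rank} OpenMP threads")
for rank, gpu in gpu_for_rank.items():
    print(f"rank {rank} on each node -> GPU {gpu}")
```

Under these assumptions the result matches the "4 MPI ranks, each using 8 OpenMP threads" line in the log, with both ranks on a node sharing the single K20m.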

     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

On 4 MPI ranks, each using 8 OpenMP threads

 Computing:           Num     Num       Call   Wall time    Giga-Cycles
                    Ranks Threads      Count         (s)      total sum     %
-----------------------------------------------------------------------------
 Domain decomp.         4       8     150000    1592.099     137554.334   2.2
 DD comm. load          4       8        751       0.057          4.947   0.0
 Neighbor search        4       8     150001     665.072      57460.919   0.9
 Launch GPU ops.        4       8   15000002     967.023      83548.916   1.3
 Comm. coord.           4       8    7350000    2488.263     214981.185   3.5
 Force                  4       8    7500001    7037.401     608018.042   9.8
 Wait + Comm. F         4       8    7500001    3931.222     339650.132   5.5
 PME mesh               4       8    7500001   40799.937    3525036.971  56.7
 Wait GPU nonlocal      4       8    7500001    1985.151     171513.300   2.8
 Wait GPU local         4       8    7500001      68.365       5906.612   0.1
 NB X/F buffer ops.     4       8   29700002    1229.406     106218.328   1.7
 Write traj.            4       8        830      28.245       2440.304   0.0
 Update                 4       8    7500001    2479.611     214233.669   3.4
 Constraints            4       8    7500001    7041.030     608331.635   9.8
 Comm. energies         4       8     150001      14.250       1231.154   0.0
 Rest                                           1601.588     138374.139   2.2
-----------------------------------------------------------------------------
 Total                                         71928.719    6214504.588 100.0
-----------------------------------------------------------------------------
 Breakdown of PME mesh computation
-----------------------------------------------------------------------------
 PME redist. X/F        4       8   15000002    8362.454     722500.151  11.6
 PME spread/gather      4       8   15000002   14836.350    1281832.463  20.6
 PME 3D-FFT             4       8   15000002    8985.776     776353.949  12.5
 PME 3D-FFT Comm.       4       8   15000002    7547.935     652127.220  10.5
 PME solve Elec         4       8    7500001    1025.249      88579.550   1.4
-----------------------------------------------------------------------------

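
As a rough cross-check of the accounting above, the following Python sketch recomputes the share of wall time spent in PME mesh and in each of its sub-steps. The wall times are copied from the table; the helper `share` and the dictionary layout are only illustrative and are not part of any GROMACS tool.

```python
# Wall times in seconds, copied from the accounting table.
total_wall = 71928.719
pme_mesh = 40799.937
pme_parts = {
    "PME redist. X/F": 8362.454,
    "PME spread/gather": 14836.350,
    "PME 3D-FFT": 8985.776,
    "PME 3D-FFT Comm.": 7547.935,
    "PME solve Elec": 1025.249,
}

def share(seconds, total=total_wall):
    """Return seconds as a percentage of total."""
    return 100.0 * seconds / total

pme_parts_sum = sum(pme_parts.values())

print(f"PME mesh: {share(pme_mesh):.1f}% of total wall time")
print(f"Sum of PME sub-steps: {pme_parts_sum:.3f} s")

# List the sub-steps from most to least expensive.
for name, seconds in sorted(pme_parts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"  {name:<18} {share(seconds):5.1f}% of total, "
          f"{share(seconds, pme_mesh):5.1f}% of PME mesh")
```

The sub-steps add up to slightly less than the PME mesh row, and spread/gather is the largest single contributor.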
First, I would like to know whether this is a serious problem. Second,
how can I improve my performance?
Does it mean that my GPU is running too fast and the CPU is waiting? Also,
what does "Wait GPU nonlocal" refer to?
Thank you.
Yunlong
--
========================================
Yunlong Liu, PhD Candidate
Computational Biology and Biophysics
Department of Biophysics and Biophysical Chemistry
School of Medicine, The Johns Hopkins University
Email: yliu120 at jhmi.edu
Address: 725 N Wolfe St, WBSB RM 601, 21205
========================================