[gmx-users] About system requirement to gromacs

Mark Abraham Mark.Abraham at anu.edu.au
Wed Aug 1 14:40:03 CEST 2012


On 1/08/2012 8:36 PM, rama david wrote:
> Thank you, Mark, for the reply.
>
> I ran mdrun and mpirun with the following commands, and pasted the output below.
> Please help me interpret it.
>
>
> 1.   mdrun -v -deffnm topol1
> 2.   mpirun -np 4 mdrun -v -deffnm topol1
>
>
> 1.    mdrun -v -deffnm topol1
>
>
> step 30, will finish Wed Aug  1 16:49:28 2012
>   Average load imbalance: 12.3 %
>   Part of the total run time spent waiting due to load imbalance: 5.1 %
>
> NOTE: 5.1 % performance was lost due to load imbalance
>        in the domain decomposition.
>        You might want to use dynamic load balancing (option -dlb.)
>
>
> 	Parallel run - timing based on wallclock.
>
>                 NODE (s)   Real (s)      (%)
>         Time:      2.035      2.035    100.0
>                 (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:    109.127      5.744      2.632      9.117
>
> gcq#98: "You're About to Hurt Somebody" (Jazzy Jeff)

Here you let your threaded mdrun use the -nt flag's default of -1, 
which lets it use all the available cores...
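
For example, to make the thread count explicit rather than relying on the 
default, and to act on the load-imbalance NOTE in your first output (a 
sketch for the thread-MPI build of GROMACS 4.5; pick a thread count that 
matches your cores):

   # run the threaded mdrun with exactly 4 threads, and force dynamic
   # load balancing on (it defaults to auto)
   mdrun -nt 4 -dlb yes -v -deffnm topol1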

>
>
>
> 2. mpirun -np 4 mdrun -v -deffnm topol1

... and here you used mpirun to start four copies of *threaded* 
*non-MPI* mdrun, which each tried to do something sensible...
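
If you actually want one run spread over four MPI processes, you need an 
mdrun built with MPI support. The binary name depends on your installation; 
mdrun_mpi is a common convention, so treat this as a sketch:

   # one MPI job split across 4 ranks, not 4 independent jobs
   mpirun -np 4 mdrun_mpi -v -deffnm topol1

On a single multi-core desktop, the threaded build alone is usually the 
simpler and at least equally fast choice.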

>
> Getting Loaded...
> Reading file topol1.tpr, VERSION 4.5.5 (single precision)
> Starting 4 threads
> Starting 4 threads
> Starting 4 threads
> Starting 4 threads
> Loaded with Money
>
> Loaded with Money
>
> Loaded with Money
>
> Loaded with Money
>
> Making 1D domain decomposition 4 x 1 x 1
> Making 1D domain decomposition 4 x 1 x 1
>
>
> Making 1D domain decomposition 4 x 1 x 1
> Making 1D domain decomposition 4 x 1 x 1

... and here you can see the four independent runs all writing their own 
output. (A real four-rank MPI run would print that domain-decomposition 
line once, from the master rank, not four times.)

>
> starting mdrun 'Protein in water'
> 50000 steps,    100.0 ps.
> starting mdrun 'Protein in water'
> 50000 steps,    100.0 ps.
>
> starting mdrun 'Protein in water'
> 50000 steps,    100.0 ps.
> starting mdrun 'Protein in water'
> 50000 steps,    100.0 ps.
>
> NOTE: Turning on dynamic load balancing
>
>
> NOTE: Turning on dynamic load balancing
>
> step 0
> NOTE: Turning on dynamic load balancing
>
> step 100, will finish Wed Aug  1 19:36:10 2012vol 0.83  imb F  2% vol
> 0.84  imb step 200, will finish Wed Aug  1 19:32:37 2012vol 0.87  imb
> F 16% vol 0.86  imb step 300, will finish Wed Aug  1 19:34:59 2012vol
> 0.88  imb F  4% vol 0.85  imb step 400, will finish Wed Aug  1
> 19:36:27 2012^Cmpirun: killing job...

... and here they are writing output that wasn't meant to look pretty in 
a silly usage case.

>
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 4257 on node  VPCEB34EN
> exited on signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 4 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
>
>
>
>
> As you can also see, the mdrun command estimates it will finish at
> Wed Aug  1 16:49:28 2012, while the mpirun version estimates
> Wed Aug  1 19:36:10 2012...
>
> The mpirun command is taking more time...

Your job took two seconds to run, so expecting it to form a reliable 
estimate of how long it will take to finish based on its current 
progress is a bit tough. For all we know, you were over-allocating 
processes with mpirun and threads, too.
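
Before picking -np and -nt values, it is worth checking how many cores the 
machine really has (plain Linux, nothing GROMACS-specific):

   # number of available cores
   nproc

If it is a 4-core box, as the auto-detected "Starting 4 threads" suggests, 
then 4 processes x 4 threads oversubscribes it fourfold, which by itself 
would explain a much later finish estimate.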

>
> So from the above output I can guess that the mpirun run used 4 processors.

Read the .log files if you want to know what they were doing.
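
Each log records how its run was started near the top. Something like this 
will pull it out (the exact strings vary between versions, so treat it as a 
sketch):

   grep -i "threads" topol1.log

Note that four copies sharing -deffnm topol1 will have fought over 
topol1.log, so the backed-up copies (#topol1.log.1# and so on) are worth 
reading too.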

Mark


