[gmx-users] Re: load imbalance

Berk Hess gmx3 at hotmail.com
Wed Apr 7 10:43:33 CEST 2010




> From: zhao0139 at ntu.edu.sg
> To: gmx-users at gromacs.org
> Date: Tue, 6 Apr 2010 20:45:37 +0800
> Subject: [gmx-users] Re: load imbalance
> 
> 
> > > > On 6/04/2010 5:39 PM, lina wrote:
> > > > > Hi everyone,
> > > > >
> > > > > Here is the result of an mdrun performed on 16 CPUs. I am not
> > > > > sure what caused it: was it due to using MPI, or something else?
> > > > >
> > > > > Writing final coordinates.
> > > > >
> > > > >   Average load imbalance: 1500.0 %
> > > > >   Part of the total run time spent waiting due to load imbalance: 187.5 %
> > > > >   Steps where the load balancing was limited by -rdd, -rcon and/or -dds:
> > > > > X 0 % Y 0 %
> > > > >
> > > > > NOTE: 187.5 % performance was lost due to load imbalance
> > > > >        in the domain decomposition.
> > > > 
> > > > You ran an inefficient but otherwise valid computation. Check out the 
> > > > manual section on domain decomposition to learn why it was inefficient, 
> > > > and whether you can do better.
> > > > 
> > > > Mark
> > > 
> > > I searched the GROMACS manual for the keyword "decomposition" and found
> > > no match. Are you sure about that? Thanks anyway, but could you give a
> > > more solution-oriented answer, so I can understand it more easily?
> > > 
> > > Thanks and regards,
> > > 
> > > lina
> > 
> > This looks strange.
> > You have 1 core doing something and 15 cores doing nothing.
> > Do you only have one small molecule?
> > How many steps was this simulation?
> > 
> > Berk
> 
> I do not think that only 1 core was doing something while the other
> 15 cores were doing nothing.
> 
> Below are the timings on 8 CPUs and 16 CPUs. I ran the simulation twice
> to compare the results.
> 
> 8cpus:
> 	Parallel run - timing based on wallclock.
> 
>                NODE (s)   Real (s)      (%)
>        Time:  52292.000  52292.000    100.0
>                        14h31:32
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:    523.244     19.720     16.523      1.453
> Finished mdrun on node 0 Tue Apr  6 05:09:47 2010
> 
> 16cpus:
> 
> 	Parallel run - timing based on wallclock.
> 
>                NODE (s)   Real (s)      (%)
>        Time:  96457.000  96457.000    100.0
>                        1d02h47:37
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:    283.696     10.701      8.957      2.679
> Finished mdrun on node 0 Mon Apr  5 01:36:18 2010
> 
> Thanks and regards,
> 
> lina

The first time I did not notice that the 16 CPU run is almost twice as slow
as the 8 CPU run (8.957 vs. 16.523 ns/day).
Are you really sure you did not mix up the two results?
The other way around, the timings would make perfect sense.
If they are not mixed up, there is a problem with your 16 CPU simulation.

What load imbalance is reported for the 8 cpu run?

Berk
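
As a rough sanity check on the numbers quoted in this thread, here is a
minimal Python sketch. It assumes that the reported load imbalance is
defined as (maximum load / average load - 1) * 100, which is consistent
with the figures above but is not taken from the log itself. The function
names and the toy load list are made up for illustration; only the ns/day
and wallclock values come from the quoted output.

```python
def imbalance_percent(loads):
    """Load imbalance in percent, ASSUMING the definition
    (max load / mean load - 1) * 100."""
    mean = sum(loads) / len(loads)
    return (max(loads) / mean - 1.0) * 100.0


def ratio(a, b):
    """Plain ratio a / b, used to compare two runs."""
    return a / b


# Hypothetical extreme case: 1 of 16 ranks does all the work.
one_busy_rank = [1.0] + [0.0] * 15
print("imbalance, 1 busy rank of 16: %.1f %%" % imbalance_percent(one_busy_rank))

# Values copied from the two quoted mdrun summaries.
throughput_ratio = ratio(8.957, 16.523)      # ns/day, 16 CPUs vs 8 CPUs
wallclock_ratio = ratio(96457.0, 52292.0)    # seconds, 16 CPUs vs 8 CPUs
print("16 CPU / 8 CPU throughput: %.2f" % throughput_ratio)
print("16 CPU / 8 CPU wallclock:  %.2f" % wallclock_ratio)
```

Under that assumed definition, one busy rank out of 16 gives exactly the
1500 % shown in the first log, and the 16 CPU run delivers only a bit more
than half the throughput of the 8 CPU run.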

 		 	   		  