[gmx-users] load imbalance!!!

Mark Abraham Mark.Abraham at anu.edu.au
Wed Apr 13 14:12:25 CEST 2011


On 13/04/2011 10:01 PM, Mark Abraham wrote:
> On 13/04/2011 7:02 PM, delara aghaie wrote:
>> Dear GROMACS users,
>> I have a DPPC monolayer on a TIP4P/2005 water layer.
>> I use GROMACS 4.0.7, tcoupl=v-rescale.
>> The command line in the .ll file is:
>> mpiexec mdrun -v -s topol.tpr
>> (I made topol.tpr with the following command:
>> grompp -c .gro -f .mdp -n .ndx -p .top -o .tpr)
>> After submitting the .ll file with qsub, the run finished, but I
>> see this message in the .e.ll file:
>> NOTE: Turning on dynamic load balancing
>> vol 0.69! imb F 149%
>> Writing final coordinates.
>> step 100, remaining runtime:     0 s
>>  Average load imbalance: 184.7 %
>>  Part of the total run time spent waiting due to load imbalance: 45.5 %
>>  Steps where the load balancing was limited by -rdd, -rcon and/or 
>> -dds: Z 5 %
>> NOTE: 45.5 % performance was lost due to load imbalance
>>       in the domain decomposition.
>> Is there a way to fix this load imbalance and get better performance?
>>
>
> Start using dynamic load balancing from the beginning of the run with 
> -dlb yes.
>
>> Does it mean that because of the load imbalance I have lost almost 45
>> percent of the performance?
>>
>
> Yes, but you simulated only 100 steps, so the time taken will be 
> dominated by I/O and setup costs.
>
> Some systems are intrinsically hard to treat in parallel. There's a 
> lot of diagnostic data before step 0, some of which might help you 
> trouble-shoot, once you demonstrate that load imbalance over a longer 
> run doesn't become less significant.

And non-load-balanced particle decomposition will do about as well as 
dynamically load-balanced domain decomposition on moderate numbers of 
processors for fairly short simulations where particles cannot diffuse 
far. You definitely have no reason at this stage to suppose that DD is 
worse than PD for your system.

It may be that for a membrane system, customizing the domain 
decomposition to suit the inhomogeneity of the particle and interaction 
density will be necessary to get best performance early in the run. This 
is why GROMACS 3.x had grompp -sort -shuffle. GROMACS 4.x will 
eventually achieve a better load-balancing result if the simulation runs 
for long enough.
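
As a rough sketch of how the comparison over a longer run might be scripted: the helper names imbalance_summary and run_case are made up for illustration, and it is an assumption that the load-balance summary lines end up in the log file named with -g as well as on stderr. The run length itself comes from nsteps in the .mdp file, so a longer run means raising nsteps and regenerating topol.tpr with grompp first.

```shell
#!/bin/sh
# Hypothetical helper: print the load-balance summary lines from an mdrun
# log (or a captured stderr file). $1 = file to search.
imbalance_summary() {
    grep -E 'Average load imbalance|waiting due to load imbalance|performance was lost' "$1"
}

# Hypothetical wrapper: run one case and report its imbalance.
# $1 = label used for the log file name; remaining arguments are passed
# to mdrun unchanged (for example: -dlb yes, or -pd).
# Other output files keep their default names, so run each case in its
# own directory if you want to keep them apart.
run_case() {
    label=$1
    shift
    mpiexec mdrun -v -s topol.tpr -g "$label.log" "$@"
    imbalance_summary "$label.log"
}

# Example calls (not executed here):
#   run_case dd_dlb -dlb yes   # domain decomposition, DLB on from step 0
#   run_case pd -pd            # particle decomposition, for comparison
```

A particle decomposition run is not expected to print these domain decomposition lines, so an empty summary for that case is normal; compare the timing figures at the end of each log instead.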

Mark