[gmx-developers] Possible impact on load balancing

Berk Hess hess at kth.se
Wed May 2 14:13:21 CEST 2012


Hi,

Load balancing is based on the time spent in the force calculation,
and only on the part that does not communicate.
So your modification is outside the scope of load balancing.
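
As a rough sketch of the idea (illustrative only; the names below are
not the actual GROMACS code): each rank measures only the wall time it
spends in the non-communicating force computation, and the dynamic load
balancer scales the domain decomposition cells from those numbers, so
time spent waiting elsewhere never enters the measurement.

    /* Illustrative sketch, not the real GROMACS implementation. */
    #include <stdio.h>

    #define NRANKS 4

    int main(void)
    {
        /* Per-rank time spent purely in the force calculation (s);
         * communication and any external wait() are excluded. */
        double t_force[NRANKS] = { 0.90, 1.10, 1.00, 1.00 };
        double t_avg = 0.0;

        for (int r = 0; r < NRANKS; r++) {
            t_avg += t_force[r] / NRANKS;
        }
        /* Slower-than-average ranks get smaller cells and faster
         * ranks larger ones, so the force work evens out over time. */
        for (int r = 0; r < NRANKS; r++) {
            printf("rank %d: cell scale %.3f\n", r, t_avg / t_force[r]);
        }
        return 0;
    }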

Cheers,

Berk

On 05/02/2012 02:10 PM, Matthieu.Dreher at imag.fr wrote:
> Hi all,
>
> I am currently working on integrating Gromacs 4.5 with a middleware 
> (FlowVR) designed to enable modular programming in a distributed 
> context. In our application, Gromacs is a module which produces data 
> (the positions of the atoms) for the other modules.
>
> To do so, we added two functionalities. The first is the 
> construction, on each node, of a message which includes the positions 
> of the local atoms, and sending it over the network (a rough sketch 
> follows the loop below). The second, the wait() function, is 
> mandatory in the middleware we use and is the first function called 
> in the while(!bLastStep) loop.
>
> To summarize, we have something like this:
> while (!bLastStep) {
>      wait();               // mandatory for FlowVR
>      ...                   // first part of a Gromacs step
>      /*********
>      *  output section (write_traj, etc.)
>      **********/
>      build_and_send_pos(); // copy the home atoms into a buffer
>                            // and send it over the network
>      ...                   // second part of a Gromacs step
> }
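>
> A hypothetical sketch of build_and_send_pos() (the middleware call
> and the rvec type below are simplified placeholders, not the real
> FlowVR API or Gromacs data structures):
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> typedef float rvec[3];  /* simplified stand-in for Gromacs's rvec */
>
> /* Placeholder for the middleware's send call; it is assumed to
>  * return immediately after handing off the buffer. */
> static void flowvr_put(const void *buf, size_t nbytes)
> {
>     printf("sent %zu bytes\n", nbytes);
> }
>
> static void build_and_send_pos(int nhome, const rvec x[])
> {
>     size_t nbytes = (size_t)nhome * sizeof(rvec);
>     void  *buf    = malloc(nbytes);
>
>     memcpy(buf, x, nbytes);   /* copy the home atoms' positions */
>     flowvr_put(buf, nbytes);  /* cheap, non-blocking hand-off */
>     free(buf);
> }
>
> int main(void)
> {
>     rvec x[2] = { { 0, 0, 0 }, { 1, 1, 1 } };
>     build_and_send_pos(2, x);
>     return 0;
> }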
>
> In our first observations, we found that building and sending the 
> messages has a very low and stable cost, but the wait() function has 
> a much more "random" cost in time. Its relative cost can vary by a 
> factor of 10 between iterations and can represent up to 10% of the 
> computation time of a step.
>
> We tested our system with several large molecular systems (100K to 
> 1.7M atoms), with and without our middleware, to evaluate its cost. 
> We found that with a small number of cores (~50), the performance 
> difference between Gromacs with and without the middleware can be 
> explained by the two functionalities we introduced into Gromacs. But 
> as the number of cores increases, the impact on performance becomes 
> severe (-30% with ~100 cores and -50% with ~400 cores). Our cluster 
> is composed of machines with two quad-core Xeons each, connected by 
> InfiniBand.
>
> When visualizing the traces, we saw that the wait() function becomes 
> very unstable and can cost 10% of an iteration's time. Yet this is 
> not sufficient to explain the large performance loss.
> I have a hypothesis that could explain at least part of the drop in 
> performance, and I would like to know whether you think it is 
> plausible.
> According to the Gromacs 4 article, the load balancing mechanism is 
> based on timings. I was wondering whether Gromacs could somehow see 
> the wait() operation in its timings. Since the wait() time can vary 
> greatly between nodes within the same iteration, Gromacs might 
> interpret this as a load imbalance and try to fix it by 
> redistributing atoms. That would be counterproductive, because the 
> imbalance is not atom-related, and it would create more imbalance, 
> and so on, in a snowball effect.
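>
> To illustrate the feared feedback loop (a toy model only, not actual 
> Gromacs behavior, as the reply above explains): if the noisy wait() 
> time leaked into the balancer's timings, the cells would be resized 
> in response to noise rather than real work, and the corrections would 
> chase the noise step after step.
>
> /* Toy model of the hypothesized snowball effect. */
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(void)
> {
>     double work = 1.0;  /* true force work of one rank   */
>     double cell = 1.0;  /* its relative domain cell size */
>
>     for (int step = 0; step < 5; step++) {
>         /* Random wait() overhead of up to 10% of a step. */
>         double wait_noise = 0.1 * (rand() % 11) / 10.0;
>         double measured   = work * cell + wait_noise;
>
>         /* A balancer fooled by the polluted timing shrinks the
>          * cell even though the excess time was not force work. */
>         cell *= (work * cell) / measured;
>         printf("step %d: measured %.3f, cell %.3f\n",
>                step, measured, cell);
>     }
>     return 0;
> }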
>
> I am probably not very clear on some points; feel free to ask for 
> more details on specific points.
>
> Thank you for your time.
>
> Dreher Matthieu



