[gmx-users] [Gromacs 3.3.3] tests for parallel - is this reasonable?

Thomas Schlesier schlesi at uni-mainz.de
Fri Oct 2 19:11:31 CEST 2009


Hi all,
I have done some small tests of parallel calculations with different 
systems.
All simulations were run on my laptop, which has a dual-core CPU.

(1) 895 waters - 2685 atoms, for 50000 steps (100 ps)
cutoffs 1.0 nm (no PME); 3 nm cubic box
single:
               NODE (s)   Real (s)      (%)
       Time:    221.760    222.000     99.9
                       3:41
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     13.913      3.648     38.961      0.616
parallel (2 cores):
               NODE (s)   Real (s)      (%)
       Time:    160.000    160.000    100.0
                       2:40
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     19.283      5.056     54.000      0.444
Total Scaling: 98% of max performance

=> 1.386 times faster

(2) 3009 waters - 9027 atoms, for 50000 steps (100 ps)
cutoffs 1.0 nm (no PME); 4.5 nm cubic box
single:
               NODE (s)   Real (s)      (%)
       Time:    747.830    751.000     99.6
                       12:27
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     13.819      3.617     11.553      2.077
parallel (2 cores):
               NODE (s)   Real (s)      (%)
       Time:    525.000    525.000    100.0
                       8:45
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     19.684      5.154     16.457      1.458
Total Scaling: 98% of max performance

=> 1.424 times faster

(3) 2 waters
all other settings the same as in (1)
single:
               NODE (s)   Real (s)      (%)
       Time:      0.680      1.000     68.0
               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.012    167.973  12705.884      0.002
parallel (2 cores):
               NODE (s)   Real (s)      (%)
       Time:      9.000      9.000    100.0
               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.003     17.870    960.000      0.025
Total Scaling: 88% of max performance

=> about 10 times slower
(This one was mainly a test to see what the values look like in a case 
where parallelisation is a waste.)
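
As a cross-check of the three ratios above, here is a minimal Python sketch 
that recomputes the speedup (single NODE time divided by parallel NODE time) 
and the parallel efficiency (speedup divided by the number of cores). The 
names `runs`, `speedup` and `efficiency` are only illustrative, and this 
efficiency is the textbook definition; it is an assumption that it is 
comparable to anything GROMACS itself prints.

```python
# NODE times in seconds, copied from the three tests above:
# (single-core time, two-core time)
runs = {
    "(1) 895 waters": (221.760, 160.000),
    "(2) 3009 waters": (747.830, 525.000),
    "(3) 2 waters": (0.680, 9.000),
}
CORES = 2  # dual-core laptop


def speedup(t_single, t_parallel):
    """How many times faster the parallel run is than the single run."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, cores=CORES):
    """Speedup as a fraction of the ideal speedup (= number of cores)."""
    return speedup(t_single, t_parallel) / cores


for name, (t1, t2) in runs.items():
    s = speedup(t1, t2)
    e = efficiency(t1, t2)
    print(f"{name}: speedup {s:.3f}x, efficiency {100 * e:.1f}%")
```

For tests (1) and (2) this gives the 1.386 and 1.424 quoted above. For test 
(3) the NODE times give a slowdown of roughly 13 times and the Real times 
(1.000 s vs 9.000 s) a slowdown of 9 times, so "about 10 times" is a rough 
summary of both.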

So now my questions:
1) Are the values reasonable (I do not mean each individual value, but 
rather the speed difference between parallel and single)? I would have 
assumed that for a big system such as (2), two cores would make the run 
a little less than 2 times faster, and not only around 1.4 times.

2) In the md0.log files (for the parallel runs) I have seen the following 
line for all three simulations:
"Load imbalance reduced performance to 200% of max"
What does it mean? And why is it the same in all three cases?

3) What does "Total Scaling" mean? In case (3) the single run is about 10 
times faster, but the parallel run reports 88% of max performance (if 
I set single to 100%, it would be only about 10% performance).
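
To put a number on the gap in question 1, one option is Amdahl's law, 
speedup = 1 / ((1 - p) + p / n), where p is the fraction of the run time 
that parallelises perfectly and n is the number of cores. The sketch below 
inverts that formula to get the p implied by a measured speedup. This is an 
idealised model chosen here for illustration, not something taken from the 
GROMACS logs: it lumps communication, load imbalance and shared-hardware 
effects into one "serial" fraction. The function names are hypothetical.

```python
def amdahl_speedup(p, n):
    """Ideal speedup on n cores if a fraction p of the work is parallel."""
    return 1.0 / ((1.0 - p) + p / n)


def implied_parallel_fraction(measured_speedup, n):
    """Invert Amdahl's law: the p that would give the measured speedup."""
    return (1.0 - 1.0 / measured_speedup) / (1.0 - 1.0 / n)


# Measured speedups from tests (1) and (2), both on 2 cores.
for label, s in [("(1)", 1.386), ("(2)", 1.424)]:
    p = implied_parallel_fraction(s, 2)
    print(f"test {label}: speedup {s} -> implied parallel fraction {p:.2f}")
```

Under this model a speedup of about 1.4 on two cores corresponds to an 
effective parallel fraction of roughly 0.6, whereas "a little less than 2" 
would need a fraction close to 1.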


I hope someone can help me. The first question is the one that interests 
me most.
Thank you for your answers.
Greetings,
Thomas


