[gmx-users] Need some precisions about logfile...
Florian Haberl
Florian.Haberl at chemie.uni-erlangen.de
Wed Dec 7 17:09:31 CET 2005
hi,
On Wednesday 07 December 2005 12:27, Stéphane Teletchéa wrote:
> Hello,
>
> I'm doing some benchmarking to get to know GROMACS (I've used Amber in
> the past), and I'm focusing for now on GROMACS 3.3.
>
> My computer (relatively modest, but enough for discovering GROMACS):
>
> Pentium IV 2x2.6 GHz SMP, 1 GB of memory
> GROMACS 3.3 compiled in single/double precision, with/without MPI
> (mpich 1.2.5.2-5mdk), MandrakeLinux LE2005, gcc 3.4.3-7mdk.
>
> The output log file contains performance figures for the calculation,
> and I'd like some clarification (I've already been through the archives
> back to 2003, but this hasn't been covered as extensively as I'd
> like):
>
> At the end of the run (double precision shown here), I get:
>
> Single-processor calculation:
> -----------------------------------------------------------------------
>                NODE (s)   Real (s)      (%)
> Time:           146.070    150.000     97.4
>                            2:26
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:      4.693      1.003      5.915      4.058
> -----------------------------------------------------------------------
>
> 2 processors (SMP):
> -----------------------------------------------------------------------
>                NODE (s)   Real (s)      (%)
> Time:           166.000    166.000    100.0
>                            2:46
>                (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
> Performance:      4.122    883.316      5.205      4.611
>
> Detailed load balancing info in percentage of average
> -----------------------------------------------------------------------
Have you really started it on 2 CPUs, i.e. with mpirun -np 2?
That information is in the first lines of the md*.log file,
like:
Log file opened on Tue Dec 6 14:21:12 2005
Host: gorgo pid: 8475 nodeid: 3 nnodes: 4
The Gromacs distribution was built Tue Dec 6 09:18:39 CET 2005 by
bco117 at sfront03 (Linux 2.6.14.2 x86_64)
Also, when running on 2 CPUs there is a nice load-balancing summary at the
end of the file (a small parsing sketch follows the table below):
Detailed load balancing info in percentage of average

Type                         NODE:    0    1  Scaling
-----------------------------------------------------
LJ:                                 162   37      61%
Coulomb:                            200    0      50%
Coulomb [W3]:                         0  199      50%
Coulomb + LJ:                       200    0      50%
Coulomb + LJ [W3]:                    0  199      50%
Coulomb + LJ [W3-W3]:                95  104      95%
Outer nonbonded loop:                92  107      93%
1,4 nonbonded interactions:         200    0      50%
NS-Pairs:                            99  100      99%
Reset In Box:                        99  100      99%
Shift-X:                             99  100      99%
CG-CoM:                             101   98      98%
Sum Forces:                          99  100      99%
Bonds:                              200    0      50%
Angles:                             200    0      50%
Propers:                            200    0      50%
Impropers:                          200    0      50%
Virial:                              99  100      99%
Update:                              99  100      99%
Stop-CM:                             99  100      99%
Calc-Ekin:                           99  100      99%
Lincs:                              200    0      50%
Lincs-Mat:                          200    0      50%
Constraint-V:                        99  100      99%
Constraint-Vir:                      96  103      96%
Settle:                              95  104      95%

Total Force:                         98  101      98%
Total Shake:                         96  103      96%

Total Scaling: 98% of max performance
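
If you want to check this without eyeballing, here is a minimal Python
sketch (the function names are illustrative; the header and table formats
assumed are the 3.3-era ones shown above). Note that mdrun computes the
Scaling column from its raw counters, so recomputing it from the rounded
per-node percentages can be off by about one percent:

    import re

    def nnodes_from_log(path):
        # Look for the 'nodeid: ... nnodes: ...' header line quoted above.
        for line in open(path):
            m = re.search(r"nnodes:\s*(\d+)", line)
            if m:
                return int(m.group(1))
        return 1  # no such line found; presumably a serial run

    def scaling(per_node):
        # The per-node columns are percent-of-average, so the average is
        # 100; Scaling = average / busiest node.
        return int(100.0 * 100.0 / max(per_node))

    # usage: nnodes_from_log("md0.log") -> 2 for an 'mpirun -np 2' run
    print(scaling([162, 37]))   # -> 61, the LJ row above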
>
> Question 1:
> * I get significantly lower performance for 2 processors (I took the
> node0 log, since no other information seems to be there). Do I simply
> need to multiply the 2-processor value by 2? e.g.:
> 1 node = 5.915 ns/day, 2 nodes = 5.205 * 2 = 10.410 ns/day?
> (It seems the GROMACS 3.3 format is different from that of 3.2.)
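
A side note on that arithmetic: if ns/day is already the wall-clock
throughput of the whole run rather than a per-node figure, the two
numbers compare directly and should not be multiplied by the node count.
A quick check in Python with the figures above:

    perf_1cpu = 5.915   # ns/day, 1-processor run above
    perf_2cpu = 5.205   # ns/day, 2-processor run above

    speedup = perf_2cpu / perf_1cpu   # ~0.88, i.e. the 2-CPU run is slower
    efficiency = 100 * speedup / 2    # ~44% parallel efficiency
    print("speedup %.2f, efficiency %.0f%%" % (speedup, efficiency))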
>
> Question 2:
> * The reported time could lead the user to the conclusion (and your web
> site states this) that a single-precision run on 2 CPUs is faster than
> two times the 1-CPU run.
>
> a) What is the rationale for this?
>
> b) I think there is a contradiction here:
>
> Here are my single-precision results (villin), as run time (s) and
> performance (ns/day):
> villin-3.3_1cpu_s    109.000   8.463
> villin-3.3_2cpus_s   102.000   8.471
> (obviously the 2-CPU run comes out at 8.471/8.463*100 = 100.1 %)
>
> What I don't understand is the time reported for the run: 109 s for
> the single-processor run, 102 s for the 2-CPU run.
> Shouldn't the second value (taken from the log file of node 0) be half
> of the 1-CPU run (109/2 = 54.5 s)?
>
> Second, I can reproduce the 109 s value from:
> <time of the run stop> - <time of the run start> =
> <Tue Dec 6 17:18:08 2005> - <Tue Dec 6 17:16:19 2005> =
> 109 s
>
> I cannot do the same for the 2-CPU job (values come from the node0
> log file):
> <time of the run stop> - <time of the run start> =
> <Tue Dec 6 17:19:57 2005> - <Tue Dec 6 17:18:11 2005> =
> 106 s (instead of 102 ...)
>
> (I've taken real time, not node time.)
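
The subtractions themselves can be recomputed mechanically; a minimal
Python check, assuming the timestamp format printed in the logs quoted
above:

    from datetime import datetime

    FMT = "%a %b %d %H:%M:%S %Y"

    def elapsed(start, stop):
        # Wall-clock seconds between two log timestamps.
        return (datetime.strptime(stop, FMT)
                - datetime.strptime(start, FMT)).seconds

    print(elapsed("Tue Dec 6 17:16:19 2005", "Tue Dec 6 17:18:08 2005"))  # 109
    print(elapsed("Tue Dec 6 17:18:11 2005", "Tue Dec 6 17:19:57 2005"))  # 106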
>
> c) How can I claim 100.1 % for the 2-CPU run over the 1-CPU run when
> the log file itself states:
> -----------------------------------------------------------------------
> Total Scaling: 99% of max performance
> -----------------------------------------------------------------------
>
> I would understand this if the statement means: take the 1-CPU
> performance, multiply it by 99% and by the number of CPUs, and you get
> the total performance (8.463*0.99*2 = 16.76 ns/day instead of
> 8.471*2 = 16.94 ns/day).
>
> Am I right?
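
That reading of the numbers, spelled out as a small Python check:

    perf_1cpu = 8.463      # ns/day, single precision, 1 CPU
    total_scaling = 0.99   # "Total Scaling: 99% of max performance"
    ncpus = 2

    estimate = perf_1cpu * total_scaling * ncpus   # 16.76 ns/day
    naive = 8.471 * ncpus                          # 16.94 ns/day
    print("%.2f vs %.2f ns/day" % (estimate, naive))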
>
> Sorry to bother you about these numbers, but I want to understand
> precisely what is stated in the log file and how it is computed
> internally (is the time for a multiprocessor run the *cumulated*
> time?).
>
> My goal is to release a generic benchmarking script (I've already
> written it; the missing part is a nice graph showing performance over
> the benchmark in xmgrace's format).
>
> For the moment I've prepared the files for SMP, but the script will
> also be used for a cluster benchmark (I'll post the results), and I
> want to be sure I'm doing it correctly.
>
> Sincerely Yours,
>
> S. Téletchéa
greetings,
florian
--
-------------------------------------------------------------------------------
Florian Haberl Universitaet Erlangen/ Nuernberg
Computer-Chemie-Centrum
Naegelsbachstr. 25
D-91052 Erlangen
Mailto: florian.haberl AT chemie.uni-erlangen.de
-------------------------------------------------------------------------------