[gmx-users] Need some precisions about logfile...
Stéphane Teletchéa
steletch at jouy.inra.fr
Wed Dec 7 12:27:59 CET 2005
I'm doing some benchmarking to get to know GROMACS (I've used AMBER in
the past), and I'm focusing for now on GROMACS 3.3.
My computer (relatively modest, just for discovering GROMACS):
Pentium IV 2*2.6 GHz SMP, 1 GB of memory
GROMACS 3.3 compiled in single and double precision, with and without MPI
(mpich, Mandrakelinux LE2005, gcc 3.4.3-7mdk).
The output log file contains indications about the performance of GROMACS
for the calculation, and I'd like some clarifications (I've been through the
archives back to 2003 already, but the topic has not been covered as
extensively as I would like):
At the end of the run (double precision shown here), I get:
Single processor calculations :
               NODE (s)   Real (s)      (%)
       Time:    146.070    150.000     97.4
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:      4.693      1.003      5.915      4.058
2 processors (SMP) :
               NODE (s)   Real (s)      (%)
       Time:    166.000    166.000    100.0
               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      4.122    883.316      5.205      4.611
Detailed load balancing info in percentage of average
Question 1:
* I get significantly lower performance with 2 processors (I took the
node0 log, since no further information seems to be available). Do I simply
need to multiply the 2-processor value by 2? E.g.:
1 node = 5.915 ns/day, 2 nodes = 5.205 * 2 = 10.410 ns/day?
(It seems the GROMACS 3.3 format is different from the 3.2 one.)
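One way to probe Question 1 is a small consistency check on the numbers in the log itself. The sketch below assumes (this is an assumption, not something the log states) that ns/day is simply the simulated length divided by the NODE time, and backs out the simulated length implied by each run. The function name `implied_simulated_ns` is hypothetical.

```python
SECONDS_PER_DAY = 86400.0


def implied_simulated_ns(ns_per_day, node_seconds):
    """Simulated length (ns) implied by a throughput and a run time.

    Assumes ns/day = simulated length / run time (an assumption).
    """
    return ns_per_day * node_seconds / SECONDS_PER_DAY


# (ns/day, NODE seconds) copied from the two double-precision logs above
runs = {
    "1 processor": (5.915, 146.070),
    "2 processors": (5.205, 166.000),
}

implied = {label: implied_simulated_ns(*values) for label, values in runs.items()}
for label, length_ns in implied.items():
    print(f"{label}: {length_ns * 1000:.3f} ps simulated")
```

Both runs come out at roughly 10 ps of simulated time. Under the stated assumption, that would suggest the printed ns/day already describes the whole run at the printed NODE time. It does not by itself show how the time is accounted across nodes.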
Question 2:
* The reported times could lead the user to conclude (and your web
site states this) that a single-precision run on 2 CPUs is more than
twice as fast as the 1-CPU run.
a) What is the rationale for this?
b) I think my results contradict this statement:
Here are my single precision results (villin):
run                  Real (s)   ns/day
villin-3.3_1cpu_s     109.000    8.463
villin-3.3_2cpus_s    102.000    8.471
(so the 2-CPU run apparently performs at 8.471/8.463*100 = 100.1 % of the
1-CPU run)
What I don't understand is the time reported for the run: 109 s for
the single-processor run, 102 s for the 2-CPU run.
Shouldn't the second value (taken from the log file of node 0) be half
of the 1-CPU value (109/2 = 54.5 s)?
Second, while I can recover the 109 s value from:
<time of the run stop> - <time of the run start> =
<Tue Dec 6 17:18:08 2005> - <Tue Dec 6 17:16:19 2005> =
109 s
I cannot recover the reported time for the 2-CPU job (values come from the
node0 logfile):
<time of the run stop> - <time of the run start> =
<Tue Dec 6 17:19:57 2005> - <Tue Dec 6 17:18:11 2005> =
106 s (instead of 102 ...)
(I've taken the real time, not the node time.)
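The two differences above can be recomputed from the quoted timestamps. This is a minimal sketch: the format string is an assumption about how the log prints its start and stop times, and `elapsed_seconds` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed layout of the log timestamps, e.g. "Tue Dec 6 17:16:19 2005"
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def elapsed_seconds(start, stop):
    """Wall-clock seconds between two log timestamps (stop minus start)."""
    t_start = datetime.strptime(start, LOG_TIME_FORMAT)
    t_stop = datetime.strptime(stop, LOG_TIME_FORMAT)
    return int((t_stop - t_start).total_seconds())


# Timestamps copied from the message; the earlier one is taken as the start
one_cpu = elapsed_seconds("Tue Dec 6 17:16:19 2005", "Tue Dec 6 17:18:08 2005")
two_cpu = elapsed_seconds("Tue Dec 6 17:18:11 2005", "Tue Dec 6 17:19:57 2005")

print(f"1 CPU: {one_cpu} s, 2 CPUs: {two_cpu} s")
```

The timestamps have one-second resolution, so each difference can be off by up to a second at either end. That alone does not account for the 4 s gap between 106 s and the 102 s reported in the log.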
c) How can I claim 100.1 % for the 2-CPU run relative to the 1-CPU run when
the log file itself states:
Total Scaling: 99% of max performance
This is how I understand such a statement:
take the 1-CPU performance, multiply it by 99% and by the number of CPUs, and
you get the total performance (8.463*0.99*2 = 16.76 ns/day instead of
8.471*2 = 16.94 ns/day).
Am I right?
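The two candidate readings in (c) can be laid side by side as plain arithmetic. This sketch only reproduces the numbers quoted in the message; neither reading is confirmed as what the log actually means, and `scaled_total` is a hypothetical name.

```python
def scaled_total(single_cpu_ns_per_day, scaling, n_cpus):
    """Total throughput if the scaling factor applies to n_cpus times the 1-CPU rate."""
    return single_cpu_ns_per_day * scaling * n_cpus


one_cpu = 8.463      # ns/day, single-precision 1-CPU run
two_cpu_log = 8.471  # ns/day, as printed in the node0 log of the 2-CPU run

# Reading A: 1-CPU rate x "Total Scaling" (99%) x number of CPUs
reading_a = scaled_total(one_cpu, 0.99, 2)
# Reading B: the value printed in the 2-CPU log, multiplied by 2
reading_b = two_cpu_log * 2
# Ratio of the printed values, as a percentage
ratio_percent = two_cpu_log / one_cpu * 100

print(f"A: {reading_a:.2f} ns/day, B: {reading_b:.2f} ns/day, ratio: {ratio_percent:.1f} %")
```

Both readings assume the printed 2-CPU value still has to be multiplied by the number of CPUs, which is exactly the point Question 1 leaves open.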
Sorry to bother you about these numbers, but I want to understand precisely
what is stated in the log file and how it is computed internally (is
the time for multiprocessor runs the *cumulated* time?).
My goal is to release a generic benchmarking script (I've already
written it; the missing part is a nice graph showing performance across the
benchmarks in xmgrace format).
I've prepared the files for SMP for the moment, but the script will be used
for a cluster benchmark (I'll post the results) and I want to be sure I'm
doing it correctly.
Sincerely Yours,
S. Téletchéa
Stéphane Téletchéa, PhD. http://www.steletch.org
Unité Mathématique Informatique et Génome http://migale.jouy.inra.fr/mig
INRA, Domaine de Vilvert Tél : (33) 134 652 121 / 3086
78352 Jouy-en-Josas cedex, France Fax : (33) 134 652 901