[gmx-users] Problem: parallel inefficient scaling with Mac Intel Core Duo
Yiannis
ioannis.nicolis at free.fr
Thu May 25 22:35:51 CEST 2006
Hello,
Has anybody tried to use gromacs in parallel with the Intel Core Duo
macintoshes?
I installed gromacs and lam on several iMacs with Intel Core Duo
processors (2 x 1.83 GHz) running the latest OS X with all updates.
Everything works, but the scaling is very inefficient.
Here are the results of the dppc benchmark:
Processes  Layout                   Speed
1          1 CPU on 1 Mac           216 ps/day
2          2 CPUs on 1 Mac          437 ps/day
2          1 CPU on each of 2 Macs  275 ps/day
4          2 CPUs on each of 2 Macs 499 ps/day
6          2 CPUs on each of 3 Macs 471 ps/day
Clearly, I get a large gain when I use both processors on the same
computer but a very small gain when I use different computers. The
network is 100 Mbit/s Ethernet (I measured a transfer rate of
88 Mbit/s with scp), so it's not a network problem.
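
The gains described above can be put into numbers as speedup and per-process efficiency relative to the single-process run. The following is a minimal Python sketch that uses only the ps/day figures from the benchmark results; the function name `scaling` and the tuple layout are illustrative choices, not part of gromacs.

```python
# Throughput figures (ps/day) copied from the dppc benchmark results.
runs = [
    ("1 process, 1 Mac", 1, 216),
    ("2 processes, 1 Mac", 2, 437),
    ("2 processes, 2 Macs", 2, 275),
    ("4 processes, 2 Macs", 4, 499),
    ("6 processes, 3 Macs", 6, 471),
]


def scaling(runs):
    """Return (label, nproc, speedup, efficiency) for each run.

    Speedup is throughput relative to the first (single-process) run;
    efficiency is speedup divided by the number of processes.
    """
    base = runs[0][2]
    rows = []
    for label, nproc, ps_per_day in runs:
        speedup = ps_per_day / base
        rows.append((label, nproc, speedup, speedup / nproc))
    return rows


for label, nproc, speedup, efficiency in scaling(runs):
    print(f"{label:<22} speedup {speedup:4.2f}  efficiency {efficiency:4.0%}")
```

Efficiency stays near 100% only for the run confined to one machine and drops as soon as a second machine is involved.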
All runs with:
grompp -shuffle -sort -np #
mpirun -np # mdrun_mpi
I compiled fftw3 with
./configure --enable-float --with-gcc-arch=prescott
and lam 7.1.2 with
./configure --with-trillium --with-fc=gfortran --with-rpi=usysv --with-tcp-short=524288 --with-rsh=ssh --with-shm-short=524288
and gromacs is 3.3.1 compiled just with
./configure --enable-mpi --program-suffix=_mpi
Is there something I can do to improve scaling?
Have you any scaling experience with such hardware to compare with my
benchmarks?
Should I try an MPI implementation other than lam, such as Open MPI?
Does anyone have experience with Xgrid? Does gromacs parallelisation
work with Xgrid?
What is the meaning of the messages in the md#.log files, such as
"Load imbalance reduced performance to 600% of max" (that is from the
6-process run), and what is the meaning of the following table in
md0.log?
It reports "Total Scaling: 99% of max performance", but in reality the
speed is 471 ps/day on 6 processors compared to 216 ps/day on one,
which gives 471/216/6 = 36%. (In other words, it takes 4006 s on
1 processor and 1833 s on 6.)
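
As a cross-check of the 36% figure, the same efficiency can be computed from the wall-clock times instead of the ps/day rates. This is a small Python sketch using only the numbers quoted above; `parallel_efficiency` is a hypothetical helper name.

```python
def parallel_efficiency(t_serial, t_parallel, nproc):
    """Speedup (serial time / parallel time) divided by process count."""
    return t_serial / t_parallel / nproc


# Wall-clock times in seconds for the 1-process and 6-process runs.
from_wall_clock = parallel_efficiency(4006, 1833, 6)

# The same quantity from the throughput figures in ps/day.
from_throughput = 471 / 216 / 6

print(f"from wall-clock times: {from_wall_clock:.1%}")
print(f"from ps/day rates:     {from_throughput:.1%}")
```

The two estimates agree to within a fraction of a percentage point, so the measured efficiency really is far below the 99% reported in the log.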
This is from the 6-process run:
Detailed load balancing info in percentage of average
Type                   NODE:   0    1    2    3    4    5  Scaling
------------------------------------------------------------------
LJ:                          100   99  100   99   97  101      98%
Coulomb:                     103   87   95  101  103  107      93%
Coulomb [W3]:                141  126   97   89   71   73      70%
Coulomb [W3-W3]:              92  105   99  100  105   96      94%
Coulomb + LJ:                105   91   96  100  101  105      94%
Coulomb + LJ [W3]:           168  134  100   83   57   55      59%
Coulomb + LJ [W3-W3]:         94  103   99   99  103   98      96%
Outer nonbonded loop:        111   94   92   93   91  116      85%
1,4 nonbonded interactions:  100  100  100  100   99   99      99%
NS-Pairs:                    100  100   99   99   99  100      99%
Reset In Box:                100  100  100  100   99   99      99%
Shift-X:                     100  100  100  100   99   99      99%
CG-CoM:                      100  100  100  100   99   99      99%
Sum Forces:                  100  100   99   99   99  100      99%
Angles:                      100  100  100  100   99   99      99%
Propers:                     100  100  100  100   99   99      99%
Impropers:                   100  100  100  100   99   99      99%
RB-Dihedrals:                100  100  100  100   99   99      99%
Virial:                      100  100  100  100   99   99      99%
Update:                      100  100  100  100   99   99      99%
Stop-CM:                     100  100  100  100   99   99      99%
Calc-Ekin:                   100  100  100  100   99   99      99%
Lincs:                       100  100  100  100   99   99      99%
Lincs-Mat:                   100  100  100  100   99   99      99%
Constraint-V:                100  100  100  100   99   99      99%
Constraint-Vir:              100  100  100  100   99   99      99%
Settle:                       99   99   99   99  100  100      99%
Total Force:                 100  100   98   99  100   99      99%
Total Shake:                  99   99   99   99  100  100      99%
Total Scaling: 99% of max performance
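
One possible reading of the "Scaling" column, inferred only from the numbers in the table (the log itself does not state the formula, so this is an assumption): each value is close to the average per-node load divided by the maximum per-node load. The Python sketch below checks that guess against a few rows; `balance_scaling` is a hypothetical name, and the small differences are expected because the log prints rounded percentages.

```python
# (per-node loads in % of average, reported Scaling in %) copied from the table.
rows = {
    "LJ": ([100, 99, 100, 99, 97, 101], 98),
    "Coulomb": ([103, 87, 95, 101, 103, 107], 93),
    "Coulomb [W3]": ([141, 126, 97, 89, 71, 73], 70),
    "Coulomb + LJ [W3]": ([168, 134, 100, 83, 57, 55], 59),
    "Outer nonbonded loop": ([111, 94, 92, 93, 91, 116], 85),
}


def balance_scaling(loads):
    """Assumed formula: 100 * average load / maximum load."""
    return 100.0 * (sum(loads) / len(loads)) / max(loads)


for name, (loads, reported) in rows.items():
    print(f"{name:<22} computed {balance_scaling(loads):5.1f}%  reported {reported}%")
```

If this reading is right, the column describes how evenly the work is divided between nodes and says nothing about time spent in communication, which would be compatible with a 99% "Total Scaling" next to a measured efficiency of 36%.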
Thanks for any help,
Ioannis