[gmx-users] gromacs 3.3.3 vs 4.0.3 performance
David van der Spoel
spoel at xray.bmc.uu.se
Fri Jan 30 18:53:03 CET 2009
Dimitris Dellis wrote:
>
> Justin A. Lemkul wrote:
>>
>>
>> Dimitris Dellis wrote:
>>> Hi.
>>> I ran exactly the same simulations with v3.3.3 and v4.0.3, on the
>>> same 64-bit Q6600/DDR2-1066 machine, gcc 4.3.2, fftw 3.2.
>>> I found that the performance of 4.0.3 is roughly 30% lower than
>>> 3.3.3 (30% higher hours/ns) for the few systems I tried (512
>>> molecules of 5-15 sites, nstlist=10).
>>> This happens with the single-precision serial and parallel (np=2,4,
>>> OpenMPI 1.3) builds, and only when electrostatics (PME) is present.
>>> With simple LJ potentials the performance is exactly the same.
>>> Is there a speed comparison of 3.3.3 vs 4.0.3 available?
>>> D.D.
>>>
>>
>> Can you show us your .mdp file? What did grompp report about the
>> relative PME load? These topics have been discussed a few times;
>> you'll find lots of pointers on optimizing performance in the list
>> archive.
Try turning off optimize_fft.
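That is, change only this line in the .mdp posted below (optimize_fft controls whether FFTW measures plan timings at startup; everything else stays as posted):

```
; disable FFTW plan optimization (timed plan measurement at startup)
optimize_fft = no
```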
>>
>> -Justin
>>
> Hi Justin,
> These are from the small system: no I/O, only 1000 steps.
> grompp.mdp
> ===========
> integrator = md
> dt = 0.0010
> nsteps = 1000
> nstxout = 0
> nstvout = 0
> nstlog = 1000
> nstcomm = 10
> nstenergy = 0
> nstxtcout = 0
> nstlist = 10
> ns_type = grid
> dispcorr = AllEnerPres
> tcoupl = berendsen
> tc-grps = System
> ref_t = 293.15
> gen_temp = 293.15
> tau_t = 0.2
> gen_vel = no
> gen_seed = 123456
> constraints = none
> constraint_algorithm = shake
> energygrps = System
> rlist = 1.6
> vdw-type = Cut-off
> rvdw = 1.6
> coulombtype = PME
> fourierspacing = 0.12
> pme_order = 4
> ewald_rtol = 1.0e-5
> optimize_fft = yes
> rcoulomb = 1.6
>
> relevant 4.0.3 grompp output:
> Estimate for the relative computational load of the PME mesh part: 0.19
>
> 4.0.3 mdrun serial timings (near-zero entries omitted)
>
> Computing:               M-Number      M-Flops  % Flops
> Coul(T) + LJ           576.513824    31708.260     71.5
> Outer nonbonded loop     8.489390       84.894      0.2
> Calc Weights             6.006000      216.216      0.5
> Spread Q Bspline       128.128000      256.256      0.6
> Gather F Bspline       128.128000     1537.536      3.5
> 3D-FFT                1088.769682     8710.157     19.6
> Solve PME               18.531513     1186.017      2.7
>
> parallel 4.0.3, np=4
> Average load imbalance: 5.2 %
> Part of the total run time spent waiting due to load imbalance: 2.3 %
> Performance (Mnbf/s, GFlops, ns/day, hour/ns): 96.086 7.380 14.414 1.665
>
> 3.3.3 mdrun serial timings
>
> Computing:               M-Number         M-Flops  % Flops
> Coul(T) + LJ           576.529632    31709.129760     72.0
> Outer nonbonded loop     8.487860       84.878600      0.2
> Spread Q Bspline       128.128000      256.256000      0.6
> Gather F Bspline       128.128000     1537.536000      3.5
> 3D-FFT                1088.769682     8710.157456     19.8
> Solve PME               17.986469     1151.133984      2.6
>
> parallel 3.3.3, np=4
> Performance (Mnbf/s, GFlops, ns/day, hour/ns): 144.132 12.556 21.600 1.111
>
> D.D.
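Taking the reported np=4 Performance lines at face value (assuming the standard GROMACS log-file column order of Mnbf/s, GFlops, ns/day, hour/ns), the gap works out as follows:

```python
# Reported parallel (np=4) performance, 4.0.3 vs 3.3.3
ns_per_day_403 = 14.414
ns_per_day_333 = 21.600
hours_per_ns_403 = 1.665
hours_per_ns_333 = 1.111

# Relative throughput loss: 1 - 14.414/21.600
slowdown = 1.0 - ns_per_day_403 / ns_per_day_333
# Relative cost increase per ns simulated: 1.665/1.111 - 1
cost_increase = hours_per_ns_403 / hours_per_ns_333 - 1.0

print(f"throughput loss: {slowdown:.0%}")      # ~33% fewer ns/day
print(f"cost increase:  {cost_increase:.0%}")  # ~50% more hours/ns
```

A ~33% drop in ns/day corresponds to a ~50% rise in hours/ns, since the two measures are reciprocals.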
>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> gmx-users mailing list gmx-users at gromacs.org
>>> http://www.gromacs.org/mailman/listinfo/gmx-users
>>> Please search the archive at http://www.gromacs.org/search before
>>> posting!
>>> Please don't post (un)subscribe requests to the list. Use the www
>>> interface or send it to gmx-users-request at gromacs.org.
>>> Can't post? Read http://www.gromacs.org/mailing_lists/users.php
>>
>
--
David van der Spoel, Ph.D., Professor of Biology
Molec. Biophys. group, Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205. Fax: +4618511755.
spoel at xray.bmc.uu.se spoel at gromacs.org http://folding.bmc.uu.se