[gmx-developers] Possible regression in gromacs 4.6
David van der Spoel
spoel at xray.bmc.uu.se
Thu Jun 21 18:32:04 CEST 2012
On 2012-06-21 18:22, Alexey Shvetsov wrote:
> Hi all!
>
> After merging commit
> commit 5ba7125c5972f2aafde2310eaa4a345cbac55da5
> Author: Erik Lindahl <erik at kth.se>
> Date: Mon May 28 20:54:17 2012 +0200
>
> New CPU detection & AVX/SSE code, removed raw assembly files.
If you had checked the installation output, you would have been warned :).
Don't worry, it will be fixed in a couple of days.
>
> I noticed a regression in GROMACS speed. I used two systems for the tests:
> 7bna, and speptide from the examples.
>
> For the 7bna system, the old 4.6 version
> (4.6-dev-20120418-3759a-dirty-unknown) gives
> R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> Computing:           Nodes    Number     G-Cycles    Seconds       %
> -----------------------------------------------------------------------
> Domain decomp.          16      5000      502.025      335.5     1.5
> DD comm. load           16      5000        4.309        2.9     0.0
> DD comm. bounds         16      5000       13.941        9.3     0.0
> Comm. coord.            16     50001      497.769      332.7     1.5
> Neighbor search         16      5001     1630.241     1089.6     4.8
> Force                   16     50001    23079.690    15425.1    67.3
> Wait + Comm. F          16     50001      618.862      413.6     1.8
> PME mesh                16     50001     6564.978     4387.7    19.1
> Write traj.             16       101       16.666       11.1     0.0
> Update                  16     50001      384.280      256.8     1.1
> Constraints             16     50001      592.154      395.8     1.7
> Comm. energies          16      5001      125.537       83.9     0.4
> Rest                    16                256.227      171.2     0.7
> -----------------------------------------------------------------------
> Total                   16              34286.680    22915.3   100.0
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> PME redist. X/F         16    100002     1176.273      786.2     3.4
> PME spread/gather       16    100002     2119.858     1416.8     6.2
> PME 3D-FFT              16    100002     1041.014      695.8     3.0
> PME 3D-FFT Comm.        16    200004     1905.967     1273.8     5.6
> PME solve               16     50001      316.714      211.7     0.9
> -----------------------------------------------------------------------
>
> Parallel run - timing based on wallclock.
>
>                NODE (s)   Real (s)        (%)
>        Time:    716.102    716.102      100.0
>                 11:56
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:   1482.789     73.686     12.066      1.989
>
> The new version 4.6-dev-20120618-283a0e5-dirty-unknown, with SSE4.1
> acceleration enabled, gives only
> R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> Computing:           Nodes    Number     G-Cycles    Seconds       %
> -----------------------------------------------------------------------
> Domain decomp.          16      5000      503.648      336.6     0.5
> DD comm. load           16      5000        5.666        3.8     0.0
> DD comm. bounds         16      5000       11.637        7.8     0.0
> Comm. coord.            16     50001      480.473      321.1     0.4
> Neighbor search         16      5001     1665.565     1113.2     1.5
> Force                   16     50001    98860.466    66073.0    89.0
> Wait + Comm. F          16     50001      608.138      406.4     0.5
> PME mesh                16     50001     7605.687     5083.2     6.8
> Write traj.             16       103       17.010       11.4     0.0
> Update                  16     50001      383.590      256.4     0.3
> Constraints             16     50001      582.954      389.6     0.5
> Comm. energies          16      5001      132.665       88.7     0.1
> Rest                    16                257.063      171.8     0.2
> -----------------------------------------------------------------------
> Total                   16             111114.560    74263.0   100.0
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> PME redist. X/F         16    100002     2258.309     1509.3     2.0
> PME spread/gather       16    100002     2111.979     1411.5     1.9
> PME 3D-FFT              16    100002     1046.271      699.3     0.9
> PME 3D-FFT Comm.        16    200004     1854.221     1239.3     1.7
> PME solve               16     50001      329.985      220.5     0.3
> -----------------------------------------------------------------------
>
> Parallel run - timing based on wallclock.
>
>                NODE (s)   Real (s)        (%)
>        Time:   2320.719   2320.719      100.0
>                 38:40
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:    457.569     22.739      3.723      6.446
>
>
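For reference, the size of the slowdown can be read off the two quoted
logs with a little arithmetic. The sketch below is only an illustration:
the numbers are copied from the 7bna tables above, while the dictionary
keys and variable names are made up here and are not part of any GROMACS
tool. It suggests that nearly all of the extra cycles land in the Force
row, which would be consistent with the nonbonded kernels falling back to
a slower code path.

```python
# Figures copied from the two quoted md.log excerpts (7bna, 16 nodes).
# The keys and variable names are illustrative only.
old = {"force": 23079.690, "total": 34286.680, "wall_s": 716.102, "ns_day": 12.066}
new = {"force": 98860.466, "total": 111114.560, "wall_s": 2320.719, "ns_day": 3.723}

# How much slower the Force row and the whole run became.
force_slowdown = new["force"] / old["force"]
wall_slowdown = new["wall_s"] / old["wall_s"]

# The same slowdown seen from the throughput side (ns/day).
throughput_drop = old["ns_day"] / new["ns_day"]

# Fraction of the additional G-cycles that is accounted for by Force alone.
force_share = (new["force"] - old["force"]) / (new["total"] - old["total"])

print(f"Force G-cycles: {force_slowdown:.2f}x")
print(f"Wall time:      {wall_slowdown:.2f}x")
print(f"ns/day ratio:   {throughput_drop:.2f}x")
print(f"Share of extra cycles in Force: {force_share:.1%}")
```

The other rows (neighbor search, PME mesh, constraints) change by far
less, so the comparison points at the force kernels rather than at
communication or PME.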
--
David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205.
spoel at xray.bmc.uu.se http://folding.bmc.uu.se