[gmx-developers] Possible regression in gromacs 4.6
Alexey Shvetsov
alexxy at omrb.pnpi.spb.ru
Thu Jun 21 18:37:58 CEST 2012
Hi!
Oh.. I didn't notice this =\ Well, anyway, these are just test systems.
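
To put numbers on the slowdown in the 7bna logs quoted below, here is a minimal Python sketch. The dictionary keys and function names are illustrative only (they are not part of GROMACS); the figures are copied from the Force and Total rows and from the Time and Performance blocks of the two logs. It suggests the Force step is roughly 4.3x slower and the run as a whole roughly 3.2x slower, with nearly all of the extra cycles landing in Force, which fits the warning that only the NxN kernels are currently accelerated.

```python
# Figures copied from the two 7bna logs quoted below (old build vs. new build).
# Key names are illustrative; they are not GROMACS identifiers.
OLD = {"force_gcycles": 23079.690, "total_gcycles": 34286.680,
       "wall_s": 716.102, "ns_per_day": 12.066}
NEW = {"force_gcycles": 98860.466, "total_gcycles": 111114.560,
       "wall_s": 2320.719, "ns_per_day": 3.723}


def slowdown(old, new):
    """Return slowdown factors (values > 1 mean the new build is slower)."""
    return {
        "force": new["force_gcycles"] / old["force_gcycles"],
        "total": new["total_gcycles"] / old["total_gcycles"],
        "wall": new["wall_s"] / old["wall_s"],
        # ns/day is a throughput, so invert it to get a slowdown factor.
        "ns_per_day": old["ns_per_day"] / new["ns_per_day"],
    }


def force_share_of_extra(old, new):
    """Fraction of the additional G-cycles that falls in the Force row."""
    extra_force = new["force_gcycles"] - old["force_gcycles"]
    extra_total = new["total_gcycles"] - old["total_gcycles"]
    return extra_force / extra_total


if __name__ == "__main__":
    for name, ratio in slowdown(OLD, NEW).items():
        print(f"{name:>10}: {ratio:.2f}x")
    print(f"Force share of extra cycles: {force_share_of_extra(OLD, NEW):.1%}")
```

The wall-clock and ns/day ratios should agree with each other, which is a quick sanity check that the numbers were copied correctly.
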
Roland Schulz wrote 2012-06-21 20:31:
> Hi,
>
> you should have gotten:
>
> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> We have just committed the new CPU detection code in this branch,
> and will commit new SSE/AVX kernels in a few days. However, this
> means that currently only the NxN kernels are accelerated!
> In the mean time, you might want to avoid production runs in 4.6.
>
> Roland
>
> On Thu, Jun 21, 2012 at 12:22 PM, Alexey Shvetsov
> <alexxy at omrb.pnpi.spb.ru> wrote:
>
>> Hi all!
>>
>> After merging commit
>> commit 5ba7125c5972f2aafde2310eaa4a345cbac55da5
>> Author: Erik Lindahl <erik at kth.se>
>> Date: Mon May 28 20:54:17 2012 +0200
>>
>> New CPU detection & AVX/SSE code, removed raw assembly files.
>>
>> I noticed a regression in GROMACS speed. I used two systems for the
>> tests: 7bna and speptide from the examples.
>>
>> For the 7bna system, the old 4.6 version
>> 4.6-dev-20120418-3759a-dirty-unknown gives
>>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>> Computing:           Nodes     Number     G-Cycles    Seconds       %
>> -----------------------------------------------------------------------
>> Domain decomp.          16       5000      502.025      335.5     1.5
>> DD comm. load           16       5000        4.309        2.9     0.0
>> DD comm. bounds         16       5000       13.941        9.3     0.0
>> Comm. coord.            16      50001      497.769      332.7     1.5
>> Neighbor search         16       5001     1630.241     1089.6     4.8
>> Force                   16      50001    23079.690    15425.1    67.3
>> Wait + Comm. F          16      50001      618.862      413.6     1.8
>> PME mesh                16      50001     6564.978     4387.7    19.1
>> Write traj.             16        101       16.666       11.1     0.0
>> Update                  16      50001      384.280      256.8     1.1
>> Constraints             16      50001      592.154      395.8     1.7
>> Comm. energies          16       5001      125.537       83.9     0.4
>> Rest                    16                 256.227      171.2     0.7
>> -----------------------------------------------------------------------
>> Total                   16               34286.680    22915.3   100.0
>> -----------------------------------------------------------------------
>> -----------------------------------------------------------------------
>> PME redist. X/F         16     100002     1176.273      786.2     3.4
>> PME spread/gather       16     100002     2119.858     1416.8     6.2
>> PME 3D-FFT              16     100002     1041.014      695.8     3.0
>> PME 3D-FFT Comm.        16     200004     1905.967     1273.8     5.6
>> PME solve               16      50001      316.714      211.7     0.9
>> -----------------------------------------------------------------------
>>
>> Parallel run - timing based on wallclock.
>>
>>                NODE (s)   Real (s)        (%)
>>        Time:    716.102    716.102      100.0
>>                        11:56
>>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
>> Performance:   1482.789     73.686     12.066      1.989
>>
>> The new version 4.6-dev-20120618-283a0e5-dirty-unknown with SSE4.1
>> acceleration enabled gives only
>>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>> Computing:           Nodes     Number     G-Cycles    Seconds       %
>> -----------------------------------------------------------------------
>> Domain decomp.          16       5000      503.648      336.6     0.5
>> DD comm. load           16       5000        5.666        3.8     0.0
>> DD comm. bounds         16       5000       11.637        7.8     0.0
>> Comm. coord.            16      50001      480.473      321.1     0.4
>> Neighbor search         16       5001     1665.565     1113.2     1.5
>> Force                   16      50001    98860.466    66073.0    89.0
>> Wait + Comm. F          16      50001      608.138      406.4     0.5
>> PME mesh                16      50001     7605.687     5083.2     6.8
>> Write traj.             16        103       17.010       11.4     0.0
>> Update                  16      50001      383.590      256.4     0.3
>> Constraints             16      50001      582.954      389.6     0.5
>> Comm. energies          16       5001      132.665       88.7     0.1
>> Rest                    16                 257.063      171.8     0.2
>> -----------------------------------------------------------------------
>> Total                   16              111114.560    74263.0   100.0
>> -----------------------------------------------------------------------
>> -----------------------------------------------------------------------
>> PME redist. X/F         16     100002     2258.309     1509.3     2.0
>> PME spread/gather       16     100002     2111.979     1411.5     1.9
>> PME 3D-FFT              16     100002     1046.271      699.3     0.9
>> PME 3D-FFT Comm.        16     200004     1854.221     1239.3     1.7
>> PME solve               16      50001      329.985      220.5     0.3
>> -----------------------------------------------------------------------
>>
>> Parallel run - timing based on wallclock.
>>
>>                NODE (s)   Real (s)        (%)
>>        Time:   2320.719   2320.719      100.0
>>                        38:40
>>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
>> Performance:    457.569     22.739      3.723      6.446
>>
>> --
>> Best Regards,
>> Alexey 'Alexxy' Shvetsov
>> Petersburg Nuclear Physics Institute, NRC Kurchatov Institute,
>> Gatchina, Russia
>> Department of Molecular and Radiation Biophysics
>> Gentoo Team Ru
>> Gentoo Linux Dev
>> mailto:alexxyum at gmail.com
>> mailto:alexxy at gentoo.org
>> mailto:alexxy at omrb.pnpi.spb.ru
>> --
>> gmx-developers mailing list
>> gmx-developers at gromacs.org
>> http://lists.gromacs.org/mailman/listinfo/gmx-developers [2]
>> Please don't post (un)subscribe requests to the list. Use the
>> www interface or send it to gmx-developers-request at gromacs.org.
>
> --
> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov [3]
> 865-241-1537, ORNL PO BOX 2008 MS6309
>
>
> Links:
> ------
> [2] http://lists.gromacs.org/mailman/listinfo/gmx-developers
> [3] http://cmb.ornl.gov
--
Best Regards,
Alexey 'Alexxy' Shvetsov
Petersburg Nuclear Physics Institute, NRC Kurchatov Institute,
Gatchina, Russia
Department of Molecular and Radiation Biophysics
Gentoo Team Ru
Gentoo Linux Dev
mailto:alexxyum at gmail.com
mailto:alexxy at gentoo.org
mailto:alexxy at omrb.pnpi.spb.ru