[gmx-developers] Possible regression in gromacs 4.6

Alexey Shvetsov alexxy at omrb.pnpi.spb.ru
Thu Jun 21 18:37:58 CEST 2012


Hi!

Oh, I didn't notice this =\ Well, anyway, these are just test systems.

Roland Schulz wrote on 2012-06-21 20:31:
> Hi,
>
> you should have gotten:
>
> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> We have just committed the new CPU detection code in this branch,
> and will commit new SSE/AVX kernels in a few days. However, this
> means that currently only the NxN kernels are accelerated!
> In the mean time, you might want to avoid production runs in 4.6.
>
> Roland
>
> On Thu, Jun 21, 2012 at 12:22 PM, Alexey Shvetsov
> <alexxy at omrb.pnpi.spb.ru> wrote:
>
>> Hi all!
>>
>> After merging commit
>> commit 5ba7125c5972f2aafde2310eaa4a345cbac55da5
>> Author: Erik Lindahl <erik at kth.se>
>> Date:   Mon May 28 20:54:17 2012 +0200
>>
>>     New CPU detection & AVX/SSE code, removed raw assembly files.
>>
>> I noticed a regression in GROMACS speed. I used two systems for tests:
>> 7bna and speptide from the examples.
>>
>> For the 7bna system, the old 4.6 version
>> (4.6-dev-20120418-3759a-dirty-unknown) gives
>>      R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>>  Computing:         Nodes     Number     G-Cycles    Seconds     %
>> -----------------------------------------------------------------------
>>  Domain decomp.        16       5000      502.025      335.5     1.5
>>  DD comm. load         16       5000        4.309        2.9     0.0
>>  DD comm. bounds       16       5000       13.941        9.3     0.0
>>  Comm. coord.          16      50001      497.769      332.7     1.5
>>  Neighbor search       16       5001     1630.241     1089.6     4.8
>>  Force                 16      50001    23079.690    15425.1    67.3
>>  Wait + Comm. F        16      50001      618.862      413.6     1.8
>>  PME mesh              16      50001     6564.978     4387.7    19.1
>>  Write traj.           16        101       16.666       11.1     0.0
>>  Update                16      50001      384.280      256.8     1.1
>>  Constraints           16      50001      592.154      395.8     1.7
>>  Comm. energies        16       5001      125.537       83.9     0.4
>>  Rest                  16                 256.227      171.2     0.7
>> -----------------------------------------------------------------------
>>  Total                 16               34286.680    22915.3   100.0
>> -----------------------------------------------------------------------
>> -----------------------------------------------------------------------
>>  PME redist. X/F       16     100002     1176.273      786.2     3.4
>>  PME spread/gather     16     100002     2119.858     1416.8     6.2
>>  PME 3D-FFT            16     100002     1041.014      695.8     3.0
>>  PME 3D-FFT Comm.      16     200004     1905.967     1273.8     5.6
>>  PME solve             16      50001      316.714      211.7     0.9
>> -----------------------------------------------------------------------
>>
>>         Parallel run - timing based on wallclock.
>>
>>                NODE (s)   Real (s)      (%)
>>        Time:    716.102    716.102    100.0
>>                        11:56
>>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
>> Performance:   1482.789     73.686     12.066      1.989
>>
>> The new version, 4.6-dev-20120618-283a0e5-dirty-unknown, with SSE4.1
>> acceleration enabled gives only
>>      R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>>  Computing:         Nodes     Number     G-Cycles    Seconds     %
>> -----------------------------------------------------------------------
>>  Domain decomp.        16       5000      503.648      336.6     0.5
>>  DD comm. load         16       5000        5.666        3.8     0.0
>>  DD comm. bounds       16       5000       11.637        7.8     0.0
>>  Comm. coord.          16      50001      480.473      321.1     0.4
>>  Neighbor search       16       5001     1665.565     1113.2     1.5
>>  Force                 16      50001    98860.466    66073.0    89.0
>>  Wait + Comm. F        16      50001      608.138      406.4     0.5
>>  PME mesh              16      50001     7605.687     5083.2     6.8
>>  Write traj.           16        103       17.010       11.4     0.0
>>  Update                16      50001      383.590      256.4     0.3
>>  Constraints           16      50001      582.954      389.6     0.5
>>  Comm. energies        16       5001      132.665       88.7     0.1
>>  Rest                  16                 257.063      171.8     0.2
>> -----------------------------------------------------------------------
>>  Total                 16              111114.560    74263.0   100.0
>> -----------------------------------------------------------------------
>> -----------------------------------------------------------------------
>>  PME redist. X/F       16     100002     2258.309     1509.3     2.0
>>  PME spread/gather     16     100002     2111.979     1411.5     1.9
>>  PME 3D-FFT            16     100002     1046.271      699.3     0.9
>>  PME 3D-FFT Comm.      16     200004     1854.221     1239.3     1.7
>>  PME solve             16      50001      329.985      220.5     0.3
>> -----------------------------------------------------------------------
>>
>>         Parallel run - timing based on wallclock.
>>
>>                NODE (s)   Real (s)      (%)
>>        Time:   2320.719   2320.719    100.0
>>                        38:40
>>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
>> Performance:    457.569     22.739      3.723      6.446
>>
>> --
>> Best Regards,
>> Alexey 'Alexxy' Shvetsov
>> Petersburg Nuclear Physics Institute, NRC Kurchatov Institute,
>> Gatchina, Russia
>> Department of Molecular and Radiation Biophysics
>> Gentoo Team Ru
>> Gentoo Linux Dev
>> mailto:alexxyum at gmail.com
>> mailto:alexxy at gentoo.org
>> mailto:alexxy at omrb.pnpi.spb.ru
>> --
>> gmx-developers mailing list
>> gmx-developers at gromacs.org
>> http://lists.gromacs.org/mailman/listinfo/gmx-developers [2]
>> Please don't post (un)subscribe requests to the list. Use the
>> www interface or send it to gmx-developers-request at gromacs.org.
>
> --
> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov [3]
> 865-241-1537, ORNL PO BOX 2008 MS6309
>
>
> Links:
> ------
> [1] http://lists.gromacs.org/mailman/listinfo/gmx-developers
> [2] http://cmb.ornl.gov
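
As a rough cross-check of the two 7bna reports quoted above, the size of the
slowdown and where it sits can be worked out from the reported numbers alone.
The following is a minimal Python sketch, not part of GROMACS: the values are
copied from the two timing tables, and the dictionary keys and variable names
are illustrative assumptions.

```python
# Values copied from the two 7bna timing reports quoted above.
# Key names are illustrative only; they are not GROMACS identifiers.
old = {  # 4.6-dev-20120418-3759a-dirty-unknown
    "wall_s": 716.102,
    "ns_per_day": 12.066,
    "force_gcycles": 23079.690,
    "pme_mesh_gcycles": 6564.978,
    "total_gcycles": 34286.680,
}
new = {  # 4.6-dev-20120618-283a0e5-dirty-unknown
    "wall_s": 2320.719,
    "ns_per_day": 3.723,
    "force_gcycles": 98860.466,
    "pme_mesh_gcycles": 7605.687,
    "total_gcycles": 111114.560,
}

# Ratio new/old for each quantity. For times and cycles, a value above 1
# means the new build needs more; for ns/day, a value below 1 means slower.
ratios = {key: new[key] / old[key] for key in old}

# Share of the additional cycles that falls in the "Force" row.
extra_total = new["total_gcycles"] - old["total_gcycles"]
extra_force = new["force_gcycles"] - old["force_gcycles"]
force_share = extra_force / extra_total

for key, value in ratios.items():
    print(f"{key:18s} new/old = {value:.2f}")
print(f"Force share of extra cycles: {force_share:.1%}")
```

With these inputs the wall time grows by a factor of about 3.2, the Force row
by about 4.3, and PME mesh by roughly 16%; the Force row accounts for nearly
all of the additional cycles.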

-- 
Best Regards,
Alexey 'Alexxy' Shvetsov
Petersburg Nuclear Physics Institute, NRC Kurchatov Institute, 
Gatchina, Russia
Department of Molecular and Radiation Biophysics
Gentoo Team Ru
Gentoo Linux Dev
mailto:alexxyum at gmail.com
mailto:alexxy at gentoo.org
mailto:alexxy at omrb.pnpi.spb.ru


