[gmx-developers] Possible regression in gromacs 4.6

Roland Schulz roland at utk.edu
Thu Jun 21 18:31:26 CEST 2012


Hi,

You should have gotten this warning:
* WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
We have just committed the new CPU detection code in this branch,
and will commit new SSE/AVX kernels in a few days. However, this
means that currently only the NxN kernels are accelerated!
In the mean time, you might want to avoid production runs in 4.6.

Roland

On Thu, Jun 21, 2012 at 12:22 PM, Alexey Shvetsov
<alexxy at omrb.pnpi.spb.ru> wrote:

> Hi all!
>
> After merging commit
> commit 5ba7125c5972f2aafde2310eaa4a345cbac55da5
> Author: Erik Lindahl <erik at kth.se>
> Date:   Mon May 28 20:54:17 2012 +0200
>
>     New CPU detection & AVX/SSE code, removed raw assembly files.
>
> I noticed a regression in GROMACS speed. I used two systems for the tests:
> 7bna and speptide from the examples.
>
> For the 7bna system, the old 4.6 version 4.6-dev-20120418-3759a-dirty-unknown
> gives
>      R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
>  Computing:         Nodes     Number     G-Cycles    Seconds     %
> -----------------------------------------------------------------------
>  Domain decomp.        16       5000      502.025      335.5     1.5
>  DD comm. load         16       5000        4.309        2.9     0.0
>  DD comm. bounds       16       5000       13.941        9.3     0.0
>  Comm. coord.          16      50001      497.769      332.7     1.5
>  Neighbor search       16       5001     1630.241     1089.6     4.8
>  Force                 16      50001    23079.690    15425.1    67.3
>  Wait + Comm. F        16      50001      618.862      413.6     1.8
>  PME mesh              16      50001     6564.978     4387.7    19.1
>  Write traj.           16        101       16.666       11.1     0.0
>  Update                16      50001      384.280      256.8     1.1
>  Constraints           16      50001      592.154      395.8     1.7
>  Comm. energies        16       5001      125.537       83.9     0.4
>  Rest                  16                 256.227      171.2     0.7
> -----------------------------------------------------------------------
>  Total                 16               34286.680    22915.3   100.0
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
>  PME redist. X/F       16     100002     1176.273      786.2     3.4
>  PME spread/gather     16     100002     2119.858     1416.8     6.2
>  PME 3D-FFT            16     100002     1041.014      695.8     3.0
>  PME 3D-FFT Comm.      16     200004     1905.967     1273.8     5.6
>  PME solve             16      50001      316.714      211.7     0.9
> -----------------------------------------------------------------------
>
>         Parallel run - timing based on wallclock.
>
>                NODE (s)   Real (s)      (%)
>        Time:    716.102    716.102    100.0
>                        11:56
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:   1482.789     73.686     12.066      1.989
>
> The new version 4.6-dev-20120618-283a0e5-dirty-unknown, with sse4.1
> acceleration enabled, gives only
>      R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
>  Computing:         Nodes     Number     G-Cycles    Seconds     %
> -----------------------------------------------------------------------
>  Domain decomp.        16       5000      503.648      336.6     0.5
>  DD comm. load         16       5000        5.666        3.8     0.0
>  DD comm. bounds       16       5000       11.637        7.8     0.0
>  Comm. coord.          16      50001      480.473      321.1     0.4
>  Neighbor search       16       5001     1665.565     1113.2     1.5
>  Force                 16      50001    98860.466    66073.0    89.0
>  Wait + Comm. F        16      50001      608.138      406.4     0.5
>  PME mesh              16      50001     7605.687     5083.2     6.8
>  Write traj.           16        103       17.010       11.4     0.0
>  Update                16      50001      383.590      256.4     0.3
>  Constraints           16      50001      582.954      389.6     0.5
>  Comm. energies        16       5001      132.665       88.7     0.1
>  Rest                  16                 257.063      171.8     0.2
> -----------------------------------------------------------------------
>  Total                 16              111114.560    74263.0   100.0
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
>  PME redist. X/F       16     100002     2258.309     1509.3     2.0
>  PME spread/gather     16     100002     2111.979     1411.5     1.9
>  PME 3D-FFT            16     100002     1046.271      699.3     0.9
>  PME 3D-FFT Comm.      16     200004     1854.221     1239.3     1.7
>  PME solve             16      50001      329.985      220.5     0.3
> -----------------------------------------------------------------------
>
>         Parallel run - timing based on wallclock.
>
>                NODE (s)   Real (s)      (%)
>        Time:   2320.719   2320.719    100.0
>                        38:40
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:    457.569     22.739      3.723      6.446
>
>
> --
> Best Regards,
> Alexey 'Alexxy' Shvetsov
> Petersburg Nuclear Physics Institute, NRC Kurchatov Institute,
> Gatchina, Russia
> Department of Molecular and Radiation Biophysics
> Gentoo Team Ru
> Gentoo Linux Dev
> mailto:alexxyum at gmail.com
> mailto:alexxy at gentoo.org
> mailto:alexxy at omrb.pnpi.spb.ru

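As a rough check on the figures quoted above, the following sketch compares the two cycle-accounting tables. It is not part of the original messages. The G-cycle values are copied from the quoted output, while the names OLD, NEW, slowdown and share_of_increase are illustrative only.

```python
# G-cycles copied from the two cycle-accounting tables quoted above.
# OLD = 4.6-dev-20120418 build, NEW = 4.6-dev-20120618 build (7bna system).
OLD = {
    "Neighbor search": 1630.241,
    "Force": 23079.690,
    "PME mesh": 6564.978,
    "Total": 34286.680,
}
NEW = {
    "Neighbor search": 1665.565,
    "Force": 98860.466,
    "PME mesh": 7605.687,
    "Total": 111114.560,
}


def slowdown(old, new):
    """Return new/old cycle ratio for every row present in `old`."""
    return {row: new[row] / old[row] for row in old}


def share_of_increase(old, new, row):
    """Fraction of the total cycle increase that is attributable to `row`."""
    return (new[row] - old[row]) / (new["Total"] - old["Total"])


ratios = slowdown(OLD, NEW)
force_share = share_of_increase(OLD, NEW, "Force")

for row, ratio in ratios.items():
    print(f"{row:16s} {ratio:5.2f}x")
print(f"Force share of the total increase: {force_share:.1%}")
```

On these numbers the Force row grows by roughly a factor of four and accounts for nearly all of the additional cycles, while the other rows stay close to their old values. That would be consistent with the warning above that only the NxN kernels are currently accelerated.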

-- 
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309