[gmx-developers] Gromacs on 48 core magny-cours AMDs
Sander Pronk
pronk at cbr.su.se
Thu Sep 1 09:19:35 CEST 2011
On 31 Aug 2011, at 22:10, Igor Leontyev wrote:
> Hi
> I am benchmarking a 100K-atom system (protein ~12K and solvent ~90K atoms, 1 fs time step, cutoffs 1.2 nm) on a 48-core 2.1 GHz AMD node. Software: Gromacs 4.5.4, compiled with gcc 4.4.6; CentOS 5.6, kernel 2.6.18-238.19.1.el5. See the results of g_tune_pme below. The performance is highly unstable: the computation time for equivalent runs can differ by more than an order of magnitude.
>
> The issue seems to be similar to what has been discussed earlier
> http://lists.gromacs.org/pipermail/gmx-users/2010-October/055113.html
> Is there any progress in resolving it?
That's an old kernel. If I remember correctly, that thread discussed issues related to thread and process affinity and NUMA awareness on older kernels.
Perhaps you could try a newer kernel?
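To make the affinity question concrete, here is a minimal Python sketch (Linux only; it is not GROMACS or Open MPI code, and the helper name pin_to_single_cpu is made up for illustration). It reads the set of cores the scheduler may use for the current process, pins the process to one of them, and then restores the original mask:

```python
import os

def pin_to_single_cpu(pid=0):
    """Pin process `pid` (0 = this process) to one allowed CPU.

    Returns (previous_cpu_set, chosen_cpu). Hypothetical helper for illustration.
    """
    allowed = os.sched_getaffinity(pid)   # CPUs the scheduler may currently use
    cpu = min(allowed)                    # arbitrary choice: lowest-numbered allowed CPU
    os.sched_setaffinity(pid, {cpu})      # restrict the process to that one CPU
    return allowed, cpu

# sched_getaffinity is not available on every platform (e.g. missing on macOS and Windows)
if hasattr(os, "sched_getaffinity"):
    before, cpu = pin_to_single_cpu()
    print("allowed before:", sorted(before))
    print("allowed while pinned:", sorted(os.sched_getaffinity(0)))
    os.sched_setaffinity(0, before)       # restore the original affinity mask
else:
    print("sched_getaffinity is not available on this platform")
```

A process that is not pinned can be moved between cores (and NUMA nodes) by the scheduler, which is one plausible source of run-to-run variation of the kind reported here.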
>
> Igor
>
>
> ------------------------------------------------------------
>
> P E R F O R M A N C E   R E S U L T S
>
> ------------------------------------------------------------
> g_tune_pme for Gromacs VERSION 4.5.4
> Number of nodes : 48
> The mpirun command is : /home/leontyev/programs/bin/mpi/openmpi/openmpi-1.4.3/bin/mpirun --hostfile node_loading.txt
> Passing # of nodes via : -np
> The mdrun command is : /home/leontyev/programs/bin/gromacs/gromacs-4.5.4/bin/mdrun_mpich1.4.3
> mdrun args benchmarks : -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log
> Benchmark steps : 1000
> dlb equilibration steps : 100
> Repeats for each test : 10
> Input file : cco_PM_ff03_sorin_scaled_meanpol.tpr
> Coulomb type : PME
> Grid spacing x y z : 0.114376 0.116700 0.116215
> Van der Waals type : Cut-off
>
> Will try these real/reciprocal workload settings:
> No.  scaling  rcoulomb  nkx  nky  nkz   spacing      rvdw  tpr file
>   0  -input-  1.200000   72   80  112  0.116700  1.200000  cco_PM_ff03_sorin_scaled_meanpol_bench00.tpr
>
> Individual timings for input file 0 (cco_PM_ff03_sorin_scaled_meanpol_bench00.tpr):
> PME nodes Gcycles ns/day PME/f Remark
> 24 3185.840 2.734 0.538 OK.
> 24 7237.416 1.203 1.119 OK.
> 24 3225.448 2.700 0.546 OK.
> 24 5844.942 1.489 1.012 OK.
> 24 4013.986 2.169 0.552 OK.
> 24 18578.174 0.469 0.842 OK.
> 24 3234.702 2.692 0.559 OK.
> 24 25818.267 0.337 0.815 OK.
> 24 32470.278 0.268 0.479 OK.
> 24 3234.806 2.692 0.561 OK.
> 23 15097.577 0.577 0.824 OK.
> 23 2948.211 2.954 0.705 OK.
> 23 15640.485 0.557 0.826 OK.
> 23 66961.240 0.130 3.215 OK.
> 23 2964.927 2.938 0.698 OK.
> 23 2965.896 2.937 0.669 OK.
> 23 11205.121 0.774 0.668 OK.
> 23 2964.737 2.938 0.672 OK.
> 23 13384.753 0.649 0.665 OK.
> 23 3738.425 2.329 0.738 OK.
> 22 3130.744 2.782 0.682 OK.
> 22 3981.770 2.187 0.659 OK.
> 22 6397.259 1.350 0.666 OK.
> 22 41374.579 0.211 3.509 OK.
> 22 3193.327 2.728 0.683 OK.
> 22 21405.007 0.407 0.871 OK.
> 22 3543.511 2.457 0.686 OK.
> 22 3539.981 2.460 0.701 OK.
> 22 30946.123 0.281 1.235 OK.
> 22 18031.023 0.483 0.729 OK.
> 21 2978.520 2.924 0.699 OK.
> 21 4487.921 1.940 0.666 OK.
> 21 39796.932 0.219 1.085 OK.
> 21 3027.659 2.877 0.714 OK.
> 21 58613.050 0.149 1.089 OK.
> 21 2973.281 2.929 0.698 OK.
> 21 34991.505 0.249 0.702 OK.
> 21 4479.034 1.944 0.696 OK.
> 21 40401.894 0.216 1.310 OK.
> 21 63325.943 0.138 1.124 OK.
> 20 17100.304 0.510 0.620 OK.
> 20 2859.158 3.047 0.832 OK.
> 20 2660.459 3.274 0.820 OK.
> 20 2871.060 3.034 0.821 OK.
> 20 105947.063 0.082 0.728 OK.
> 20 2851.650 3.055 0.827 OK.
> 20 2766.737 3.149 0.837 OK.
> 20 13887.535 0.627 0.813 OK.
> 20 9450.158 0.919 0.854 OK.
> 20 2983.460 2.920 0.838 OK.
> 19 0.000 0.000 - No DD grid found for these settings.
> 18 62490.241 0.139 1.070 OK.
> 18 75625.947 0.115 0.512 OK.
> 18 3584.509 2.430 1.176 OK.
> 18 4988.745 1.734 1.197 OK.
> 18 92981.804 0.094 0.529 OK.
> 18 3070.496 2.837 1.192 OK.
> 18 3089.339 2.820 1.204 OK.
> 18 5880.675 1.465 1.170 OK.
> 18 3094.133 2.816 1.214 OK.
> 18 3573.552 2.437 1.191 OK.
> 17 0.000 0.000 - No DD grid found for these settings.
> 16 3105.597 2.805 0.998 OK.
> 16 2719.826 3.203 1.045 OK.
> 16 3124.013 2.788 0.992 OK.
> 16 2708.751 3.216 1.030 OK.
> 16 3116.887 2.795 1.023 OK.
> 16 2695.859 3.232 1.038 OK.
> 16 2710.272 3.215 1.033 OK.
> 16 32639.259 0.267 0.514 OK.
> 16 56748.577 0.153 0.959 OK.
> 16 32362.192 0.269 1.816 OK.
> 15 40410.983 0.216 1.241 OK.
> 15 3727.108 2.337 1.262 OK.
> 15 3297.944 2.642 1.242 OK.
> 15 23012.201 0.379 0.994 OK.
> 15 3328.307 2.618 1.248 OK.
> 15 56869.719 0.153 0.568 OK.
> 15 26662.044 0.327 0.854 OK.
> 15 44026.837 0.198 1.198 OK.
> 15 3754.812 2.320 1.238 OK.
> 15 68683.967 0.127 0.844 OK.
> 14 2934.532 2.969 1.466 OK.
> 14 2824.434 3.085 1.430 OK.
> 14 2778.103 3.137 1.391 OK.
> 14 28435.548 0.306 0.957 OK.
> 14 2876.113 3.030 1.396 OK.
> 14 2803.951 3.108 1.438 OK.
> 14 9538.366 0.913 1.400 OK.
> 14 2887.242 3.018 1.424 OK.
> 14 32542.115 0.268 0.529 OK.
> 14 14256.539 0.609 1.432 OK.
> 13 5010.011 1.732 1.768 OK.
> 13 19270.893 0.452 1.481 OK.
> 13 3451.426 2.525 1.860 OK.
> 13 28566.186 0.305 0.620 OK.
> 13 3481.006 2.504 1.833 OK.
> 13 28457.876 0.306 0.933 OK.
> 13 3689.128 2.362 1.795 OK.
> 13 3451.925 2.525 1.831 OK.
> 13 34918.063 0.249 1.838 OK.
> 13 3473.566 2.509 1.854 OK.
> 12 42705.256 0.204 1.039 OK.
> 12 4934.453 1.763 1.292 OK.
> 12 16759.163 0.520 1.288 OK.
> 12 27660.618 0.315 0.855 OK.
> 12 6293.874 1.380 1.263 OK.
> 12 40502.818 0.215 1.284 OK.
> 12 31595.114 0.276 0.615 OK.
> 12 61936.825 0.140 0.612 OK.
> 12 3013.850 2.891 1.345 OK.
> 12 3840.023 2.269 1.310 OK.
> 0 2628.156 3.317 - OK.
> 0 2573.649 3.387 - OK.
> 0 95523.769 0.091 - OK.
> 0 2594.895 3.360 - OK.
> 0 2614.131 3.335 - OK.
> 0 2610.647 3.339 - OK.
> 0 2560.067 3.405 - OK.
> 0 2609.485 3.341 - OK.
> 0 2603.154 3.349 - OK.
> 0 2583.289 3.375 - OK.
> -1( 16) 2672.797 3.260 1.002 OK.
> -1( 16) 57769.149 0.151 1.723 OK.
> -1( 16) 48598.334 0.179 1.138 OK.
> -1( 16) 2699.333 3.228 1.040 OK.
> -1( 16) 54243.321 0.161 1.679 OK.
> -1( 16) 2719.854 3.203 1.051 OK.
> -1( 16) 2716.365 3.207 1.051 OK.
> -1( 16) 24278.608 0.359 0.835 OK.
> -1( 16) 19357.359 0.449 1.006 OK.
> -1( 16) 45500.360 0.191 0.795 OK.
>
> Tuning took 500.5 minutes.
>
> ------------------------------------------------------------
> Summary of successful runs:
> Line tpr PME nodes  Gcycles Av.    Std.dev.  ns/day  PME/f   DD grid
>    0   0        24    10684.386   10896.612   1.675  0.702   3   4   2
>    1   0        23    13787.137   19462.982   1.678  0.968   1   5   5
>    2   0        22    13554.332   13814.153   1.535  1.042   2  13   1
>    3   0        21    25507.574   24601.033   1.358  0.878   3   3   3
>    4   0        20    16337.758   31934.533   2.062  0.799   2   2   7
>    5   0        18    25837.944   36067.176   1.689  1.045   3   2   5
>    6   0        16    14193.123   19370.807   2.194  1.045   4   4   2
>    7   0        15    27377.392   24308.700   1.132  1.069   3  11   1
>    8   0        14    10187.694   11414.829   2.044  1.286   1   2  17
>    9   0        13    13377.008   12969.168   1.547  1.581   1   5   7
>   10   0        12    23924.199   20299.796   0.997  1.090   3   4   3
>   11   0         0    11890.124   29385.874   3.030      -   6   4   2
>   12   0   -1( 16)    26055.548   23371.735   1.439  1.132   4   4   2
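The ns/day column in this summary appears to be a plain mean over the 10 repeats (it reproduces the 3.030 reported for 0 PME nodes), so a single stalled run drags the whole row down. As a rough sketch, using only the ns/day values listed above for the 0-PME-node runs, comparing the mean with the median makes that visible. The function name summarize is just an illustrative choice:

```python
import statistics

# ns/day for the ten repeats with 0 PME nodes, copied from the individual timings above
ns_per_day = [3.317, 3.387, 0.091, 3.360, 3.335, 3.339, 3.405, 3.341, 3.349, 3.375]

def summarize(values):
    """Return (mean, median, slowest, fastest) for a list of ns/day values."""
    return (statistics.mean(values), statistics.median(values),
            min(values), max(values))

mean, median, slowest, fastest = summarize(ns_per_day)
print(f"mean    {mean:.3f} ns/day")     # sensitive to the single slow run
print(f"median  {median:.3f} ns/day")   # robust against it
print(f"fastest/slowest ratio {fastest / slowest:.1f}")
```

Here the mean is about 3.03 ns/day while the median is about 3.345 ns/day: nine of the ten runs are consistent, and one run is roughly 37 times slower than the fastest. Ranking the settings by median rather than mean would be less affected by such outliers.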