[gmx-users] FEP and loss of performance

Luca Bellucci lcbllcc at gmail.com
Mon Apr 4 16:56:37 CEST 2011


Hi Chris,
thanks for the suggestions.
In the previous mail there was a mistake:
couple-moltype = SOL (the solvent), not "Protein_Chain_P".
Now the load imbalance seems reasonable, because
the water box is large (~9.0 nm).
However, the problem persists and the performance loss is very high, so I have
redone the calculations with these commands:

grompp -f md.mdp -c ../Run-02/confout.gro -t ../Run-02/state.cpt \
       -p ../topo.top -n ../index.ndx -o md.tpr -maxwarn 1

mdrun -s md.tpr -o md
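
(One further check I have not tried yet, just a sketch assuming an MPI-enabled mdrun: fixing the PME/PP split and the dynamic load balancing explicitly, e.g.

mpirun -np 8 mdrun -s md.tpr -o md -npme 2 -dlb yes

so that both runs use exactly the same decomposition; the -npme value of 2 is only a guess for 8 processes, not a tested setting.)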

This is part of the md.mdp file:

; Run parameters
; define        = -DPOSRES
integrator      = md
nsteps          = 1000
dt              = 0.002
[..]
free_energy     = yes    ; yes/no
init_lambda     = 0.9
delta_lambda    = 0.0
couple-moltype  = SOL    ; solvent water
couple-lambda0  = vdw-q
couple-lambda1  = none
couple-intramol = yes
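
(Another variation I have not run yet, just a sketch: splitting the perturbation so that the charges and the van der Waals interactions are decoupled in separate runs rather than together with vdw-q, e.g.

couple-lambda0 = vdw    ; only vdW interactions present at lambda = 0
couple-lambda1 = none

which might show whether the electrostatic or the vdW part dominates the extra cost.)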

Result for the free energy calculation (free_energy = yes):
 Computing:         Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
 Domain decomp.         8        126       22.050        8.3     0.1
 DD comm. load          8         15        0.009        0.0     0.0
 DD comm. bounds        8         12        0.031        0.0     0.0
 Comm. coord.           8       1001       17.319        6.5     0.0
 Neighbor search        8        127      436.569      163.7     1.1
 Force                  8       1001    34241.576    12840.9    87.8
 Wait + Comm. F         8       1001       19.486        7.3     0.0
 PME mesh               8       1001     4190.758     1571.6    10.7
 Write traj.            8          7        1.827        0.7     0.0
 Update                 8       1001       12.557        4.7     0.0
 Constraints            8       1001       26.496        9.9     0.1
 Comm. energies         8       1002       10.710        4.0     0.0
 Rest                   8                  25.142        9.4     0.1
-----------------------------------------------------------------------
 Total                  8               39004.531    14627.1   100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
 PME redist. X/F        8       3003     3479.771     1304.9     8.9
 PME spread/gather      8       4004      277.574      104.1     0.7
 PME 3D-FFT             8       4004      378.090      141.8     1.0
 PME solve              8       2002       55.033       20.6     0.1
-----------------------------------------------------------------------
	Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:   1828.385   1828.385    100.0
                       30:28
                             (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:      3.115      3.223      0.095    253.689

Then I switched off only the free_energy keyword, redid the calculation, and got:
 Computing:         Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
 Domain decomp.         8         77       10.975        4.1     0.6
 DD comm. load          8          1        0.001        0.0     0.0
 Comm. coord.           8       1001       14.480        5.4     0.8
 Neighbor search        8         78      136.479       51.2     7.3
 Force                  8       1001     1141.115      427.9    61.3
 Wait + Comm. F         8       1001       17.845        6.7     1.0
 PME mesh               8       1001      484.581      181.7    26.0
 Write traj.            8          5        1.221        0.5     0.1
 Update                 8       1001        9.976        3.7     0.5
 Constraints            8       1001       20.275        7.6     1.1
 Comm. energies         8        992        5.933        2.2     0.3
 Rest                   8                  19.670        7.4     1.1
-----------------------------------------------------------------------
 Total                  8                1862.552      698.5   100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
 PME redist. X/F        8       2002       92.204       34.6     5.0
 PME spread/gather      8       2002      192.337       72.1    10.3
 PME 3D-FFT             8       2002      177.373       66.5     9.5
 PME solve              8       1001       22.512        8.4     1.2
-----------------------------------------------------------------------
	Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:     87.309     87.309    100.0
                       1:27
                         (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    439.731     23.995      1.981     12.114
Finished mdrun on node 0 Mon Apr  4 16:52:04 2011
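
Comparing the two runs: the wall time goes from 87.3 s to 1828.4 s (about a 21x slowdown) and the Force term alone from 427.9 s to 12840.9 s (about 30x), while domain decomposition and communication stay small in both cases, so essentially all of the extra cost seems to be in the non-bonded force loop once SOL is treated as perturbed.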

Luca	




> If we accept your text at face value, then the simulation slowed down
> by a factor of 1500%, certainly not the 16% of the load balancing.
>
> Please let us know what version of gromacs you used, and cut and paste the
> commands that you used to run gromacs (so we can verify that you ran
> on the same number of processors) and a diff of the .mdp
> files (so that we can verify that you ran for the same number of steps).
>
> You might be correct about the slowdown, but let's rule out some other
> more obvious problems first.
>
> Chris.
>
> -- original message --
>
>
> Dear all,
> when I run a single free energy simulation
> I noticed that there is a loss of performance with respect to
> the normal MD.
>
> free_energy    = yes
> init_lambda    = 0.9
> delta_lambda   = 0.0
> couple-moltype = Protein_Chain_P
> couple-lambda0 = vdw-q
> couple-lambda1 = none
> couple-intramol= yes
>
>     Average load imbalance: 16.3 %
>     Part of the total run time spent waiting due to load imbalance: 12.2 %
>     Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X0 %
>     Time:   1852.712   1852.712    100.0
>
> free_energy    = no
>     Average load imbalance: 2.7 %
>     Part of the total run time spent waiting due to load imbalance: 1.7 %
>     Time:    127.394    127.394    100.0
>
> It seems that the loss of performance is due in part to the load
> imbalance in the domain decomposition; however, I tried to change
> these keywords without benefit.
> Any comment is welcome.
>
> Thanks




