[gmx-users] FEP and loss of performance

Luca Bellucci lcbllcc at gmail.com
Mon Apr 4 17:37:13 CEST 2011


Dear Chris and Justin,
thank you for your valuable suggestions.
This is a test that I performed on a single machine with 8 cores
and GROMACS 4.5.4.

I am trying to enhance the sampling of a protein using the decoupling scheme
of the free energy module of GROMACS. However, when I decouple only the
protein, it collapses. Because I simulate in NVT, I thought this was an
effect of the solvent, so I tried decoupling the solvent as well to
understand the system's behavior.
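
For reference, the free-energy block I used for the protein-only decoupling run is
sketched below; the commented soft-core keywords at the end are only illustrative
additions one might try, they were not part of my run:

free_energy    = yes
init_lambda    = 0.9
delta_lambda   = 0.0
couple-moltype = Protein_Chain_P   ; decouple the protein
couple-lambda0 = vdw-q             ; fully interacting at lambda = 0
couple-lambda1 = none              ; fully decoupled at lambda = 1
couple-intramol= yes               ; also decouple intramolecular non-bonded interactions
; illustrative soft-core settings (not used in my actual run):
; sc-alpha       = 0.5
; sc-power       = 1
; sc-sigma       = 0.3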

I expected a loss of performance, but not such a drastic one.
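From the timing tables quoted below, the wall-clock time goes from 87.309 s with
free_energy = no to 1828.385 s with free_energy = yes, roughly a 21x slowdown, and
the Force timing alone grows from about 1141 to 34242 G-cycles.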
Luca 

> Load balancing problems I can understand, but why would it take longer
> in absolute time? I would have thought that some nodes would simply be
> sitting idle, but this should not cause an increase in the overall
> simulation time (15x at that!).
>
> There must be some extra communication?
>
> I agree with Justin that this seems like a strange thing to do, but
> still I think that there must be some underlying coding issue (probably
> one that only exists because of a reasonable assumption that nobody
> would annihilate the largest part of their system).
>
> Chris.
>
> Luca Bellucci wrote:
> > Hi Chris,
> > thanks for the suggestions,
> > in the previous mail there is a mistake because
> > couple-moltype = SOL (for solvent) and not "Protein_chaim_P".
> > Now the problem of the load balance seems reasonable, because
> > the water box is large, ~9.0 nm.
> Now your outcome makes a lot more sense.  You're decoupling all of the
> solvent? I don't see how that is going to be physically stable or terribly
> meaningful, but it explains your performance loss.  You're annihilating a
> significant number of interactions (probably the vast majority of all the
> nonbonded interactions in the system), which I would expect would cause
> continuous load balancing issues.
>
> -Justin
>
> > However the problem exists and the performance loss is very high, so I
> > have redone the calculations with this command:
> >
> > grompp -f md.mdp -c ../Run-02/confout.gro -t ../Run-02/state.cpt -p ../topo.top -n ../index.ndx -o md.tpr -maxwarn 1
> >
> > mdrun -s md.tpr -o md
> >
> > this is part of the md.mdp file:
> >
> > ; Run parameters
> > ; define          = -DPOSRES
> > integrator     = md
> > nsteps         = 1000
> > dt             = 0.002
> > [..]
> > free_energy    = yes     ; no
> > init_lambda    = 0.9
> > delta_lambda   = 0.0
> > couple-moltype = SOL    ; solvent water
> > couple-lambda0 = vdw-q
> > couple-lambda1 = none
> > couple-intramol= yes
> > Result for free energy calculation:
> >
> >  Computing:         Nodes     Number     G-Cycles    Seconds     %
> > -----------------------------------------------------------------------
> >  Domain decomp.         8        126       22.050        8.3     0.1
> >  DD comm. load          8         15        0.009        0.0     0.0
> >  DD comm. bounds        8         12        0.031        0.0     0.0
> >  Comm. coord.           8       1001       17.319        6.5     0.0
> >  Neighbor search        8        127      436.569      163.7     1.1
> >  Force                  8       1001    34241.576    12840.9    87.8
> >  Wait + Comm. F         8       1001       19.486        7.3     0.0
> >  PME mesh               8       1001     4190.758     1571.6    10.7
> >  Write traj.            8          7        1.827        0.7     0.0
> >  Update                 8       1001       12.557        4.7     0.0
> >  Constraints            8       1001       26.496        9.9     0.1
> >  Comm. energies         8       1002       10.710        4.0     0.0
> >  Rest                   8                  25.142        9.4     0.1
> > -----------------------------------------------------------------------
> >  Total                  8               39004.531    14627.1   100.0
> > -----------------------------------------------------------------------
> > -----------------------------------------------------------------------
> >  PME redist. X/F        8       3003     3479.771     1304.9     8.9
> >  PME spread/gather      8       4004      277.574      104.1     0.7
> >  PME 3D-FFT             8       4004      378.090      141.8     1.0
> >  PME solve              8       2002       55.033       20.6     0.1
> > -----------------------------------------------------------------------
> >
> >  Parallel run - timing based on wallclock.
> >
> >                NODE (s)   Real (s)      (%)
> >        Time:   1828.385   1828.385    100.0
> >                        30:28
> >                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> >  Performance:      3.115      3.223      0.095    253.689
> >
> > I switched off only the free_energy keyword, redid the calculation, and obtained:
> >  Computing:         Nodes     Number     G-Cycles    Seconds     %
> > -----------------------------------------------------------------------
> >  Domain decomp.         8         77       10.975        4.1     0.6
> >  DD comm. load          8          1        0.001        0.0     0.0
> >  Comm. coord.           8       1001       14.480        5.4     0.8
> >  Neighbor search        8         78      136.479       51.2     7.3
> >  Force                  8       1001     1141.115      427.9    61.3
> >  Wait + Comm. F         8       1001       17.845        6.7     1.0
> >  PME mesh               8       1001      484.581      181.7    26.0
> >  Write traj.            8          5        1.221        0.5     0.1
> >  Update                 8       1001        9.976        3.7     0.5
> >  Constraints            8       1001       20.275        7.6     1.1
> >  Comm. energies         8        992        5.933        2.2     0.3
> >  Rest                   8                  19.670        7.4     1.1
> > -----------------------------------------------------------------------
> >  Total                  8                1862.552      698.5   100.0
> > -----------------------------------------------------------------------
> > -----------------------------------------------------------------------
> >  PME redist. X/F        8       2002       92.204       34.6     5.0
> >  PME spread/gather      8       2002      192.337       72.1    10.3
> >  PME 3D-FFT             8       2002      177.373       66.5     9.5
> >  PME solve              8       1001       22.512        8.4     1.2
> > -----------------------------------------------------------------------
> >
> >  Parallel run - timing based on wallclock.
> >
> >                NODE (s)   Real (s)      (%)
> >        Time:     87.309     87.309    100.0
> >                         1:27
> >                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> >  Performance:    439.731     23.995      1.981     12.114
> >  Finished mdrun on node 0 Mon Apr  4 16:52:04 2011
> >
> > Luca
> >
> > > If we accept your text at face value, then the simulation slowed down
> > > by a factor of 1500%, certainly not the 16% of the load balancing.
> > >
> > > Please let us know what version of gromacs you used, and cut and paste the
> > > commands that you used to run gromacs (so we can verify that you ran
> > > on the same number of processors) and a diff of the .mdp files
> > > (so that we can verify that you ran for the same number of steps).
> > >
> > > You might be correct about the slowdown, but let's rule out some other
> > > more obvious problems first.
> > >
> > > Chris.
> > >
> > > -- original message --
> > >
> > > Dear all,
> > > when I run a single free energy simulation
> > > I noticed that there is a loss of performance with respect to
> > > normal MD:
> > >
> > > free_energy    = yes
> > > init_lambda    = 0.9
> > > delta_lambda   = 0.0
> > > couple-moltype = Protein_Chain_P
> > > couple-lambda0 = vdw-q
> > > couple-lambda1 = none
> > > couple-intramol= yes
> > >
> > >     Average load imbalance: 16.3 %
> > >     Part of the total run time spent waiting due to load imbalance: 12.2 %
> > >     Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X0 %
> > >     Time:   1852.712   1852.712    100.0
> > >
> > > free_energy    = no
> > >     Average load imbalance: 2.7 %
> > >     Part of the total run time spent waiting due to load imbalance: 1.7 %
> > >     Time:    127.394    127.394    100.0
> > >
> > > It seems that the loss of performance is due in part to the load
> > > imbalance in the domain decomposition; however, I tried to change
> > > these keywords without benefit.
> > > Any comment is welcome.




