[gmx-users] FEP and loss of performance
Luca Bellucci
lcbllcc at gmail.com
Mon Apr 4 17:37:13 CEST 2011
Dear Chris and Justin
Thank you for your valuable suggestions.
This is a test that I ran on a single machine with 8 cores
and GROMACS 4.5.4.
I am trying to enhance the sampling of a protein using the decoupling scheme
of the GROMACS free energy module. However, when I decouple only the
protein, the protein collapses. Because I simulated in NVT, I thought this
was an effect of the solvent, so I tried to decouple the solvent as well
to understand the system's behavior.
I expected a loss of performance, but not such a drastic one.
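For clarity, these are the two decoupling setups I compared (just a sketch;
the full mdp for the solvent run is quoted further down):

; setup 1: decouple only the protein (the run where the protein collapses)
free_energy     = yes
init_lambda     = 0.9
delta_lambda    = 0.0
couple-moltype  = Protein_Chain_P
couple-lambda0  = vdw-q
couple-lambda1  = none
couple-intramol = yes

; setup 2: identical, but decoupling the solvent (the slow run discussed here)
couple-moltype  = SOL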
Luca
> Load balancing problems I can understand, but why would it take longer
> in absolute time? I would have thought that some nodes would simply be
> sitting idle, but this should not cause an increase in the overall
> simulation time (15x at that!).
>
> There must be some extra communication?
>
> I agree with Justin that this seems like a strange thing to do, but
> still I think that there must be some underlying coding issue (probably
> one that only exists because of a reasonable assumption that nobody
> would annihilate the largest part of their system).
>
> Chris.
>
> Luca Bellucci wrote:
> > Hi Chris,
> > thanks for the suggestions.
> > In the previous mail there is a mistake: couple-moltype should be
> > SOL (for the solvent), not "Protein_Chain_P".
> > Now the load-balance problem seems reasonable, because
> > the water box is large, ~9.0 nm.
> Now your outcome makes a lot more sense. You're decoupling all of the
> solvent? I don't see how that is going to be physically stable or terribly
> meaningful, but it explains your performance loss. You're annihilating a
> significant number of interactions (probably the vast majority of all the
> nonbonded interactions in the system), which I would expect would cause
> continuous load balancing issues.
>
> -Justin
>
> > However, the problem exists and the performance loss is very high, so I
> > have redone the calculations with these commands:
> >
> > grompp -f md.mdp -c ../Run-02/confout.gro -t ../Run-02/state.cpt -p ../topo.top -n ../index.ndx -o md.tpr -maxwarn 1
> >
> > mdrun -s md.tpr -o md
> >
> > This is part of the md.mdp file:
> >
> > ; Run parameters
> > ; define         = -DPOSRES
> > integrator       = md
> > nsteps           = 1000
> > dt               = 0.002
> > [..]
> > free_energy      = yes       ; /no
> > init_lambda      = 0.9
> > delta_lambda     = 0.0
> > couple-moltype   = SOL       ; solvent water
> > couple-lambda0   = vdw-q
> > couple-lambda1   = none
> > couple-intramol  = yes
> >
> > Result for the free energy calculation:
> >
> >  Computing:         Nodes   Number    G-Cycles   Seconds      %
> > -----------------------------------------------------------------------
> >  Domain decomp.         8      126      22.050       8.3    0.1
> >  DD comm. load          8       15       0.009       0.0    0.0
> >  DD comm. bounds        8       12       0.031       0.0    0.0
> >  Comm. coord.           8     1001      17.319       6.5    0.0
> >  Neighbor search        8      127     436.569     163.7    1.1
> >  Force                  8     1001   34241.576   12840.9   87.8
> >  Wait + Comm. F         8     1001      19.486       7.3    0.0
> >  PME mesh               8     1001    4190.758    1571.6   10.7
> >  Write traj.            8        7       1.827       0.7    0.0
> >  Update                 8     1001      12.557       4.7    0.0
> >  Constraints            8     1001      26.496       9.9    0.1
> >  Comm. energies         8     1002      10.710       4.0    0.0
> >  Rest                   8               25.142       9.4    0.1
> > -----------------------------------------------------------------------
> >  Total                  8            39004.531   14627.1  100.0
> > -----------------------------------------------------------------------
> >  PME redist. X/F        8     3003    3479.771    1304.9    8.9
> >  PME spread/gather      8     4004     277.574     104.1    0.7
> >  PME 3D-FFT             8     4004     378.090     141.8    1.0
> >  PME solve              8     2002      55.033      20.6    0.1
> > -----------------------------------------------------------------------
> >
> > Parallel run - timing based on wallclock.
> >
> >                NODE (s)   Real (s)      (%)
> >        Time:   1828.385   1828.385    100.0
> >                        30:28
> >                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> > Performance:      3.115      3.223      0.095    253.689
> >
> > I switched off only the free_energy keyword, redid the calculation,
> > and got:
> >
> >  Computing:         Nodes   Number    G-Cycles   Seconds      %
> > -----------------------------------------------------------------------
> >  Domain decomp.         8       77      10.975       4.1    0.6
> >  DD comm. load          8        1       0.001       0.0    0.0
> >  Comm. coord.           8     1001      14.480       5.4    0.8
> >  Neighbor search        8       78     136.479      51.2    7.3
> >  Force                  8     1001    1141.115     427.9   61.3
> >  Wait + Comm. F         8     1001      17.845       6.7    1.0
> >  PME mesh               8     1001     484.581     181.7   26.0
> >  Write traj.            8        5       1.221       0.5    0.1
> >  Update                 8     1001       9.976       3.7    0.5
> >  Constraints            8     1001      20.275       7.6    1.1
> >  Comm. energies         8      992       5.933       2.2    0.3
> >  Rest                   8               19.670       7.4    1.1
> > -----------------------------------------------------------------------
> >  Total                  8             1862.552     698.5  100.0
> > -----------------------------------------------------------------------
> >  PME redist. X/F        8     2002      92.204      34.6    5.0
> >  PME spread/gather      8     2002     192.337      72.1   10.3
> >  PME 3D-FFT             8     2002     177.373      66.5    9.5
> >  PME solve              8     1001      22.512       8.4    1.2
> > -----------------------------------------------------------------------
> >
> > Parallel run - timing based on wallclock.
> >
> >                NODE (s)   Real (s)      (%)
> >        Time:     87.309     87.309    100.0
> >                         1:27
> >                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> > Performance:    439.731     23.995      1.981     12.114
> >
> > Finished mdrun on node 0 Mon Apr 4 16:52:04 2011
> >
> > Luca
> >
> > > If we accept your text at face value, then the simulation slowed down
> > > by a factor of 1500%, certainly not the 16% of the load balancing.
> > >
> > > Please let us know what version of GROMACS you used, cut and paste the
> > > commands that you used to run GROMACS (so we can verify that you ran
> > > on the same number of processors), and cut and paste a diff of the .mdp
> > > files (so that we can verify that you ran for the same number of steps).
> > >
> > > You might be correct about the slowdown, but let's rule out some other
> > > more obvious problems first.
> > >
> > > Chris.
> > >
> > > -- original message --
> > >
> > > Dear all,
> > > when I run a single free energy simulation
> > > I notice that there is a loss of performance with respect to
> > > a normal MD run:
> > >
> > > free_energy     = yes
> > > init_lambda     = 0.9
> > > delta_lambda    = 0.0
> > > couple-moltype  = Protein_Chain_P
> > > couple-lambda0  = vdw-q
> > > couple-lambda1  = none
> > > couple-intramol = yes
> > >
> > > Average load imbalance: 16.3 %
> > > Part of the total run time spent waiting due to load imbalance: 12.2 %
> > > Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X0 %
> > > Time:    1852.712   1852.712    100.0
> > >
> > > free_energy = no
> > > Average load imbalance: 2.7 %
> > > Part of the total run time spent waiting due to load imbalance: 1.7 %
> > > Time:     127.394    127.394    100.0
> > >
> > > It seems that the loss of performance is due in part to the load
> > > imbalance in the domain decomposition; however, I tried to change
> > > these keywords without benefit.
> > > Any comment is welcome.