[gmx-users] Issues running Gromacs with MPI/OpenMP in cpu cluster

Thomas Schlesier schlesi at uni-mainz.de
Thu Feb 13 18:50:25 CET 2014


Hi,
I'm no expert on this stuff, but could it be that you get about 40
of the #my_mol.log.$n# files (probably only 39)?
It could be that 'mpirun' starts 40 separate 'mdrun' jobs and each one
writes its own output.
For GROMACS 4.6.x I always used
mdrun -nt X ...
to start a parallel run (where X would be 40 in your case). I think
GROMACS 4.6.x has the MPI stuff built into it and therefore doesn't need
an external 'mpirun' (but I could be wrong - I only know how to use the
stuff, but don't completely understand it...)
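
A minimal sketch for checking this, assuming the output prefix is my_mol
as in your file names and that the backups sit in the job's working
directory (count_backups is a hypothetical helper of mine, not part of
GROMACS). If each count comes out close to the number of ranks you asked
for, 'mpirun' is most likely starting independent copies of an mdrun that
was not built with real MPI, and each copy backs up the files written by
the others.

```shell
# count_backups DIR PREFIX EXT
# Hypothetical helper: counts GROMACS-style backup files named
# "#PREFIX.EXT.N#" directly inside DIR and prints the number.
count_backups() {
    # '#' and '.' are literal in the find pattern; only '*' is a wildcard.
    find "$1" -maxdepth 1 -type f -name "#$2.$3.*#" | wc -l | tr -d ' '
}

# Report the number of backups for each output type in the current directory.
for ext in log edr gro; do
    printf '%s: %s backup file(s)\n' "$ext" "$(count_backups . my_mol "$ext")"
done

# Untested suggestions, to be adapted to your installation:
#
# Show which parallel library this mdrun was built with (the 4.6.x version
# header should contain an "MPI library" line):
#   mdrun -version 2>&1 | grep -i 'MPI library'
#
# Built-in thread-MPI build, one node only:
#   mdrun -nt 40 -deffnm test_$i
#
# Real MPI build (often installed as mdrun_mpi, but the name is site-specific):
#   mpirun -np 40 mdrun_mpi -ntomp 1 -deffnm test_$i
```

As far as I know, the built-in thread-MPI behind -nt only works within a
single node, so a 40-core run spread over several nodes would need an
mdrun compiled against a real MPI library.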

Hope this helps a little

Greetings
Thomas



On 13.02.2014 18:13,
gromacs.org_gmx-users-request at maillist.sys.kth.se wrote:
> Dear GROMACS users,
>
> I am facing a strange situation when running GROMACS (v4.6.3) on our local CPU cluster using MPI/OpenMP parallelization. I am trying to simulate a large heterogeneous aqueous polymer system in an octahedron box.
>
> I use the following command to run my simulations, and I submit the job to the cluster through the SGE queuing system.
> mpirun -nolocal -np 40 mdrun -ntomp 1
>
> Immediately after the job launches, a huge number of backup files are generated in the same directory. The backup files are named #my_mol.log.$n#, #my_mol.edr.$n#, etc., i.e. more than one backup of the .log, .edr, .gro, etc. files is generated, I suppose because of the parallelization. After running some steps, the log file and the SGE error file start complaining about disk space, although there is no shortage of disk space. Neither lowering the output frequency nor increasing -cpt from 15 to 50 helped.
>
> Stranger still, the backup files keep being updated individually in the same directory (although the main .log file has stopped updating), and each one runs for the full number of steps (e.g. 10000 steps), as if $n separate simulations were running.
>
> My .mdp file is :
> title                        =  MD test run of 2 ns
> ; using Verlet scheme
> cutoff-scheme  = Verlet
> ; Run parameters
> integrator            = md
> nsteps                  = 1000000
> dt                            = 0.002
> ; Output control
> nstxout                = 10000
> nstvout                = 5000
> nstxtcout             = 1000
> nstenergy           = 1000
> nstlog                    = 1000
> ; Bond parameters
> continuation      = yes
> constraint_algorithm = lincs
> constraints          = all-bonds
> lincs_iter              = 1
> lincs_order         = 4
> ; Neighborsearching
> ns_type               = grid
> nstlist                    = 10
> rlist                         = 1.0
> rcoulomb             = 1.0
> vdw-type            = cut-off
> rvdw                      = 1.0
> ; Electrostatics
> coulombtype     = PME
> pme_order         = 4
> fourierspacing   = 0.16
> ; Temperature coupling is on
> tcoupl                   = V-rescale
> tc-grps                  = polymer SOL_Ion
> tau_t                     = 0.1      0.1
> ref_t                      = 300     300
> ; Pressure coupling is on
> pcoupl                  = Parrinello-Rahman
> pcoupltype         = isotropic
> tau_p                    = 2.0
> ref_p                     = 1.0
> compressibility = 4.5e-5
> ; Periodic boundary conditions
> pbc                         = xyz
> ; Dispersion correction
> DispCorr               = EnerPres
> ; Velocity generation
> gen_vel                = no
>
>
> The commands I used to run:
>
> j = i-1
> ### When I change something in the .mdp file
> grompp -f md.mdp -c test_$j.tpr -o test_$i.tpr -t test_$j.cpt -p test_solv.top -n my_index.ndx >& log.grompp
> mpirun -nolocal -np 40 $GMX_HOME/bin/mdrun -ntomp 1 -s test_$i.tpr -deffnm test_$i -cpt 30
>
> ### When I try to extend the simulation
> tpbconv -s test_${j}.tpr -o test_${i}.tpr -extend 1000 >& log_${i}.tpbconv
> mpirun -nolocal -np 40 $GMX_HOME/bin/mdrun -s test_${i}.tpr -deffnm test_${i} -cpi test_${j}.cpt -cpt 30
>
>
> As this is my first attempt with GROMACS, and no one around me has ever used MD/GROMACS, I could not verify this strange behavior. I searched the web and the community forum for the error but found no reference to this issue; I may not have searched with the proper terminology, however. I am also in touch with my sysadmin, who is also a little lost on this. If anyone could help me get out of this situation, I would be highly obliged. Please let me know if I have not been explicit enough in describing the issue.
>
>
> Looking forward to hearing from you.
> Thanks in advance,
> Mousumi
>
>
> Ontario Institute for Cancer Research
> MaRS Centre
> Toronto, Ontario


