[gmx-users] MPICH or LAM/MPI
Arneh Babakhani
ababakha at mccammon.ucsd.edu
Tue Jun 27 07:30:49 CEST 2006
Hi All,
Ok, I've successfully built the MPI version of mdrun and am now trying to
run my simulation on 32 processors. After preprocessing with grompp and
the option -np 32, I launch mdrun with the following script (where CONF
is the input file name and NPROC is the number of processors):
/opt/mpich/intel/bin/mpirun -v -np $NPROC -machinefile \$TMPDIR/machines
~/gromacs-mpi/bin/mdrun -np $NPROC -s $CONF -o $CONF -c After$CONF -e
$CONF -g $CONF >& $CONF.job
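In case it helps to see exactly what gets executed: a dry-run sketch of the launch line, echoing the command instead of running it so the variable expansion (including $TMPDIR, which I escape as \$TMPDIR in the submission script so the batch system expands it later) can be inspected. Paths and file names below are just the ones from this post; adjust for your site, and note this assumes an interactive shell where $TMPDIR is already set.

```shell
# Dry-run sketch of the mpirun invocation (paths/names from this post).
# Echoing the fully expanded command is a quick sanity check when the
# p4 device reports a connection timeout at startup.
NPROC=32
CONF=FullMD7
MPIRUN=/opt/mpich/intel/bin/mpirun
MDRUN="$HOME/gromacs-mpi/bin/mdrun"
CMD="$MPIRUN -v -np $NPROC -machinefile $TMPDIR/machines \
$MDRUN -np $NPROC -s $CONF -o $CONF -c After$CONF -e $CONF -g $CONF"
echo "$CMD"
```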
Everything seems to start up ok, but then GROMACS stalls: it never
actually starts the simulation, and after about 7 minutes it aborts
completely. I've pasted the log file below, which shows the simulation
stalling at Step 0 with no discernible error (only a note that AMD
3DNow support is not available, which makes sense because I'm not
running on AMD hardware).
Further down I've also pasted the job file, FullMD7.job, which is
normally empty when everything is running smoothly. There are some
errors at the end, but they're rather cryptic to me, and I'm not sure
whether they're a cause or an effect. If anyone has any suggestions,
I'd love to hear them.
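One small thing I did decode (assuming a stock Linux box with Python available, which these nodes appear to be): the repeated "errno = 32" in the job file is EPIPE, i.e. a broken pipe, which looks like an effect of the startup timeout rather than its cause.

```shell
# "errno = 32" in FullMD7.job is EPIPE ("Broken pipe") on Linux: the
# MPICH p4 helper wrote to a socket whose peer had already exited.
python3 -c 'import os; print(os.strerror(32))'   # prints: Broken pipe
```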
Thanks,
Arneh
*****FullMD70.log******
Log file opened on Mon Jun 26 21:51:55 2006
Host: compute-0-1.local pid: 13353 nodeid: 0 nnodes: 32
The Gromacs distribution was built Wed Jun 21 16:01:01 PDT 2006 by
ababakha at chemcca40.ucsd.edu (Linux 2.6.9-22.ELsmp i686)
:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 3.3.1 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2006, The GROMACS development team,
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) /home/ababakha/gromacs-mpi/bin/mdrun (-:
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------
CPU= 0, lastcg= 515, targetcg= 5799, myshift= 14
CPU= 1, lastcg= 1055, targetcg= 6339, myshift= 15
CPU= 2, lastcg= 1595, targetcg= 6879, myshift= 16
CPU= 3, lastcg= 2135, targetcg= 7419, myshift= 17
CPU= 4, lastcg= 2675, targetcg= 7959, myshift= 18
CPU= 5, lastcg= 3215, targetcg= 8499, myshift= 19
CPU= 6, lastcg= 3755, targetcg= 9039, myshift= 20
CPU= 7, lastcg= 4112, targetcg= 9396, myshift= 20
CPU= 8, lastcg= 4381, targetcg= 9665, myshift= 20
CPU= 9, lastcg= 4650, targetcg= 9934, myshift= 20
CPU= 10, lastcg= 4919, targetcg=10203, myshift= 20
CPU= 11, lastcg= 5188, targetcg=10472, myshift= 20
CPU= 12, lastcg= 5457, targetcg= 174, myshift= 20
CPU= 13, lastcg= 5726, targetcg= 443, myshift= 19
CPU= 14, lastcg= 5995, targetcg= 712, myshift= 19
CPU= 15, lastcg= 6264, targetcg= 981, myshift= 18
CPU= 16, lastcg= 6533, targetcg= 1250, myshift= 18
CPU= 17, lastcg= 6802, targetcg= 1519, myshift= 17
CPU= 18, lastcg= 7071, targetcg= 1788, myshift= 17
CPU= 19, lastcg= 7340, targetcg= 2057, myshift= 16
CPU= 20, lastcg= 7609, targetcg= 2326, myshift= 16
CPU= 21, lastcg= 7878, targetcg= 2595, myshift= 15
CPU= 22, lastcg= 8147, targetcg= 2864, myshift= 15
CPU= 23, lastcg= 8416, targetcg= 3133, myshift= 14
CPU= 24, lastcg= 8685, targetcg= 3402, myshift= 14
CPU= 25, lastcg= 8954, targetcg= 3671, myshift= 13
CPU= 26, lastcg= 9223, targetcg= 3940, myshift= 13
CPU= 27, lastcg= 9492, targetcg= 4209, myshift= 13
CPU= 28, lastcg= 9761, targetcg= 4478, myshift= 13
CPU= 29, lastcg=10029, targetcg= 4746, myshift= 13
CPU= 30, lastcg=10298, targetcg= 5015, myshift= 13
CPU= 31, lastcg=10566, targetcg= 5283, myshift= 13
nsb->shift = 20, nsb->bshift= 0
Listing Scalars
nsb->nodeid: 0
nsb->nnodes: 32
nsb->cgtotal: 10567
nsb->natoms: 25925
nsb->shift: 20
nsb->bshift: 0
Nodeid index homenr cgload workload
0 0 788 516 516
1 788 828 1056 1056
2 1616 828 1596 1596
3 2444 828 2136 2136
4 3272 828 2676 2676
5 4100 828 3216 3216
6 4928 828 3756 3756
7 5756 807 4113 4113
8 6563 807 4382 4382
9 7370 807 4651 4651
10 8177 807 4920 4920
11 8984 807 5189 5189
12 9791 807 5458 5458
13 10598 807 5727 5727
14 11405 807 5996 5996
15 12212 807 6265 6265
16 13019 807 6534 6534
17 13826 807 6803 6803
18 14633 807 7072 7072
19 15440 807 7341 7341
20 16247 807 7610 7610
21 17054 807 7879 7879
22 17861 807 8148 8148
23 18668 807 8417 8417
24 19475 807 8686 8686
25 20282 807 8955 8955
26 21089 807 9224 9224
27 21896 807 9493 9493
28 22703 807 9762 9762
29 23510 804 10030 10030
30 24314 807 10299 10299
31 25121 804 10567 10567
parameters of the run:
integrator = md
nsteps = 1500000
init_step = 0
ns_type = Grid
nstlist = 10
ndelta = 2
bDomDecomp = FALSE
decomp_dir = 0
nstcomm = 1
comm_mode = Linear
nstcheckpoint = 1000
nstlog = 10
nstxout = 500
nstvout = 1000
nstfout = 0
nstenergy = 10
nstxtcout = 0
init_t = 0
delta_t = 0.002
xtcprec = 1000
nkx = 64
nky = 64
nkz = 80
pme_order = 6
ewald_rtol = 1e-05
ewald_geometry = 0
epsilon_surface = 0
optimize_fft = TRUE
ePBC = xyz
bUncStart = FALSE
bShakeSOR = FALSE
etc = Berendsen
epc = Berendsen
epctype = Semiisotropic
tau_p = 1
ref_p (3x3):
ref_p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}
ref_p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}
ref_p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}
compress (3x3):
compress[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
compress[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
compress[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e-30}
andersen_seed = 815131
rlist = 0.9
coulombtype = PME
rcoulomb_switch = 0
rcoulomb = 0.9
vdwtype = Cut-off
rvdw_switch = 0
rvdw = 1.4
epsilon_r = 1
epsilon_rf = 1
tabext = 1
gb_algorithm = Still
nstgbradii = 1
rgbradii = 2
gb_saltconc = 0
implicit_solvent = No
DispCorr = No
fudgeQQ = 1
free_energy = no
init_lambda = 0
sc_alpha = 0
sc_power = 0
sc_sigma = 0.3
delta_lambda = 0
disre_weighting = Conservative
disre_mixed = FALSE
dr_fc = 1000
dr_tau = 0
nstdisreout = 100
orires_fc = 0
orires_tau = 0
nstorireout = 100
dihre-fc = 1000
dihre-tau = 0
nstdihreout = 100
em_stepsize = 0.01
em_tol = 10
niter = 20
fc_stepsize = 0
nstcgsteep = 1000
nbfgscorr = 10
ConstAlg = Lincs
shake_tol = 1e-04
lincs_order = 4
lincs_warnangle = 30
lincs_iter = 1
bd_fric = 0
ld_seed = 1993
cos_accel = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
grpopts:
nrdf: 11903.3 39783.7 285.983
ref_t: 310 310 310
tau_t: 0.1 0.1 0.1
anneal: No No No
ann_npoints: 0 0 0
acc: 0 0 0
nfreeze: N N N
energygrp_flags[ 0]: 0
efield-x:
n = 0
efield-xt:
n = 0
efield-y:
n = 0
efield-yt:
n = 0
efield-z:
n = 0
efield-zt:
n = 0
bQMMM = FALSE
QMconstraints = 0
QMMMscheme = 0
scalefactor = 1
qm_opts:
ngQM = 0
Max number of graph edges per atom is 4
Table routines are used for coulomb: TRUE
Table routines are used for vdw: FALSE
Using a Gaussian width (1/beta) of 0.288146 nm for Ewald
Cut-off's: NS: 0.9 Coulomb: 0.9 LJ: 1.4
System total charge: 0.000
Generated table with 1200 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1200 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1200 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 500 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 500 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 500 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Enabling SPC water optimization for 6631 molecules.
Will do PME sum in reciprocal space.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essman, L. Perela, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------
Parallelized PME sum used.
PARALLEL FFT DATA:
local_nx: 2 local_x_start: 0
local_ny_after_transpose: 2 local_y_start_after_transpose 0
Removing pbc first time
Done rmpbc
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest, initial mass: 207860
There are: 788 Atoms
Constraining the starting coordinates (step -2)
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, J. P. M. Postma, A. DiNola and J. R. Haak
Molecular dynamics with coupling to an external bath
J. Chem. Phys. 81 (1984) pp. 3684-3690
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
LINCS: A Linear Constraint Solver for molecular simulations
J. Comp. Chem. 18 (1997) pp. 1463-1472
-------- -------- --- Thank You --- -------- --------
Initializing LINear Constraint Solver
number of constraints is 776
average number of constraints coupled to one constraint is 2.5
Rel. Constraint Deviation: Max between atoms RMS
Before LINCS 0.008664 87 88 0.003001
After LINCS 0.000036 95 96 0.000005
Constraining the coordinates at t0-dt (step -1)
Rel. Constraint Deviation: Max between atoms RMS
Before LINCS 0.093829 12 13 0.009919
After LINCS 0.000131 11 14 0.000021
Started mdrun on node 0 Mon Jun 26 21:52:34 2006
Initial temperature: 310.388 K
Step Time Lambda
0 0.00000 0.00000
Grid: 8 x 8 x 13 cells
Configuring nonbonded kernels...
Testing AMD 3DNow support... not present.
Testing ia32 SSE support... present.
********FullMD7.job***************
*running /home/ababakha/gromacs-mpi/bin/mdrun on 32 LINUX ch_p4 processors
Created /home/ababakha/SMDPeptideSimulation/CapParSMD/FullMD/PI12637
NNODES=32, MYRANK=0, HOSTNAME=compute-0-1.local
NNODES=32, MYRANK=1, HOSTNAME=compute-0-1.local
NNODES=32, MYRANK=30, HOSTNAME=compute-0-29.local
NNODES=32, MYRANK=24, HOSTNAME=compute-0-12.local
NNODES=32, MYRANK=28, HOSTNAME=compute-0-30.local
NNODES=32, MYRANK=3, HOSTNAME=compute-0-26.local
NNODES=32, MYRANK=14, HOSTNAME=compute-0-22.local
NNODES=32, MYRANK=6, HOSTNAME=compute-0-31.local
NNODES=32, MYRANK=8, HOSTNAME=compute-0-20.local
NNODES=32, MYRANK=7, HOSTNAME=compute-0-31.local
NNODES=32, MYRANK=18, HOSTNAME=compute-0-27.local
NNODES=32, MYRANK=2, HOSTNAME=compute-0-26.local
NNODES=32, MYRANK=23, HOSTNAME=compute-0-4.local
NNODES=32, MYRANK=31, HOSTNAME=compute-0-29.local
NNODES=32, MYRANK=5, HOSTNAME=compute-0-21.local
NNODES=32, MYRANK=27, HOSTNAME=compute-0-3.local
NNODES=32, MYRANK=4, HOSTNAME=compute-0-21.local
NNODES=32, MYRANK=20, HOSTNAME=compute-0-8.local
NNODES=32, MYRANK=11, HOSTNAME=compute-0-7.local
NNODES=32, MYRANK=9, HOSTNAME=compute-0-20.local
NNODES=32, MYRANK=12, HOSTNAME=compute-0-19.local
NNODES=32, MYRANK=13, HOSTNAME=compute-0-19.local
NNODES=32, MYRANK=21, HOSTNAME=compute-0-8.local
NNODES=32, MYRANK=22, HOSTNAME=compute-0-4.local
NNODES=32, MYRANK=10, HOSTNAME=compute-0-7.local
NNODES=32, MYRANK=17, HOSTNAME=compute-0-25.local
NNODES=32, MYRANK=25, HOSTNAME=compute-0-12.local
NNODES=32, MYRANK=15, HOSTNAME=compute-0-22.local
NNODES=32, MYRANK=29, HOSTNAME=compute-0-30.local
NNODES=32, MYRANK=19, HOSTNAME=compute-0-27.local
NNODES=32, MYRANK=26, HOSTNAME=compute-0-3.local
NNODES=32, MYRANK=16, HOSTNAME=compute-0-25.local
NODEID=26 argc=13
NODEID=25 argc=13
NODEID=24 argc=13
NODEID=23 argc=13
NODEID=22 argc=13
NODEID=21 argc=13
NODEID=20 argc=13
NODEID=19 argc=13
NODEID=18 argc=13
NODEID=13 argc=13
NODEID=17 argc=13
NODEID=15 argc=13
NODEID=14 argc=13
NODEID=16 argc=13
NODEID=0 argc=13
NODEID=12 argc=13
NODEID=6 argc=13
NODEID=11 argc=13
NODEID=1 argc=13
NODEID=10 argc=13
NODEID=5 argc=13
NODEID=30 argc=13
NODEID=7 argc=13
NODEID=27 argc=13
NODEID=31 argc=13
NODEID=2 argc=13
NODEID=9 argc=13
NODEID=28 argc=13
NODEID=4 argc=13
NODEID=29 argc=13
NODEID=8 argc=13
NODEID=3 argc=13
:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 3.3.1 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2006, The GROMACS development team,
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) /home/ababakha/gromacs-mpi/bin/mdrun (-:
Option Filename Type Description
------------------------------------------------------------
-s FullMD7.tpr Input Generic run input: tpr tpb tpa xml
-o FullMD7.trr Output Full precision trajectory: trr trj
-x traj.xtc Output, Opt. Compressed trajectory (portable xdr
format)
-c AfterFullMD7.gro Output Generic structure: gro g96 pdb xml
-e FullMD7.edr Output Generic energy: edr ene
-g FullMD7.log Output Log file
-dgdl dgdl.xvg Output, Opt. xvgr/xmgr file
-field field.xvg Output, Opt. xvgr/xmgr file
-table table.xvg Input, Opt. xvgr/xmgr file
-tablep tablep.xvg Input, Opt. xvgr/xmgr file
-rerun rerun.xtc Input, Opt. Generic trajectory: xtc trr trj gro
g96 pdb
-tpi tpi.xvg Output, Opt. xvgr/xmgr file
-ei sam.edi Input, Opt. ED sampling input
-eo sam.edo Output, Opt. ED sampling output
-j wham.gct Input, Opt. General coupling stuff
-jo bam.gct Output, Opt. General coupling stuff
-ffout gct.xvg Output, Opt. xvgr/xmgr file
-devout deviatie.xvg Output, Opt. xvgr/xmgr file
-runav runaver.xvg Output, Opt. xvgr/xmgr file
-pi pull.ppa Input, Opt. Pull parameters
-po pullout.ppa Output, Opt. Pull parameters
-pd pull.pdo Output, Opt. Pull data output
-pn pull.ndx Input, Opt. Index file
-mtx nm.mtx Output, Opt. Hessian matrix
-dn dipole.ndx Output, Opt. Index file
Option Type Value Description
------------------------------------------------------
-[no]h bool no Print help info and quit
-[no]X bool no Use dialog box GUI to edit command line options
-nice int 19 Set the nicelevel
-deffnm string Set the default filename for all file options
-[no]xvgr bool yes Add specific codes (legends etc.) in the output
xvg files for the xmgrace program
-np int 32 Number of nodes, must be the same as used for
grompp
-nt int 1 Number of threads to start on each node
-[no]v bool no Be loud and noisy
-[no]compact bool yes Write a compact log file
-[no]sepdvdl bool no Write separate V and dVdl terms for each
interaction type and node to the log file(s)
-[no]multi bool no Do multiple simulations in parallel (only with
-np > 1)
-replex int 0 Attempt replica exchange every # steps
-reseed int -1 Seed for replica exchange, -1 is generate a seed
-[no]glas bool no Do glass simulation with special long range
corrections
-[no]ionize bool no Do a simulation including the effect of an X-Ray
bombardment on your system
Reading file FullMD7.tpr, VERSION 3.3.1 (single precision)
starting mdrun 'My membrane with peptides in water'
1500000 steps, 3000.0 ps.
p30_10831: p4_error: Timeout in establishing connection to remote
process: 0
rm_l_30_10832: (341.608281) net_send: could not write to fd=5, errno = 32
rm_l_31_10896: (341.269706) net_send: could not write to fd=5, errno = 32
p30_10831: (343.634411) net_send: could not write to fd=5, errno = 32
p31_10895: (343.296105) net_send: could not write to fd=5, errno = 32
p0_13353: p4_error: net_recv read: probable EOF on socket: 1
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
p0_13353: (389.926083) net_send: could not write to fd=4, errno = 32