[gmx-users] strange problem with parallel mpirun
David van der Spoel
spoel at xray.bmc.uu.se
Sun Dec 4 13:53:57 CET 2005
Lubos Vrbka wrote:
> hi guys,
> i encounter the following problem when running mdrun in parallel:
> our setup is 2-processor nodes
> run on 2 processors - 1 node (mpich - shmem) is ok
> run on 4 processors - 2 nodes (mpich - socket-p4) is ok
> run on 6 processors - 3 nodes (mpich - socket-p4) is not ok
> what happens? all 6 processes are started on the appropriate nodes.
> however, only processes on the last node really calculate something
> (they take ~98+99% of the processor time). processes on the first two
> nodes can be found using ps uax - they don't consume enough resources to
> be displayed in the output of top...
> i tried to run it on 4 processors, 1 on each node - it was ok
> then with 5 processors, one on each node - the same problem - fifth
> processor was running the calculation, the other 4 weren't doing anything.
> i tried to find something in the archives, but to no avail :( this seems
> really strange... does anyone know what's going on here? is there any
> limit on the number of processors?
> thank you for any hint. with best regards,
What kind of system is this?
In general MPICH always gives problems, at least with gromacs. If at all
possible try it with LAM.
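
When reporting back, it may help to tabulate the per-process CPU readings by node, so it is obvious which nodes are actually computing. The following is only a rough sketch: the host names (node1 to node3), the 0.0 readings for the quiet processes, and the 50% cutoff are assumptions made for illustration. Only the ~98% and ~99% figures come from the description above, and the readings themselves would have to be collected by hand, e.g. from ps on each node.

```python
from collections import defaultdict


def summarize_cpu(samples, busy_threshold=50.0):
    """Group per-process CPU readings by node and classify each node.

    samples: iterable of (hostname, cpu_percent) pairs, one per mdrun process.
    busy_threshold: assumed cutoff (percent CPU) for "really calculating".
    Returns (busy_nodes, idle_nodes). A node with both busy and quiet
    processes appears in neither list.
    """
    per_node = defaultdict(list)
    for host, cpu in samples:
        per_node[host].append(cpu)
    busy = sorted(h for h, c in per_node.items() if min(c) >= busy_threshold)
    idle = sorted(h for h, c in per_node.items() if max(c) < busy_threshold)
    return busy, idle


# Hypothetical readings for the 6-process, 3-node case described above.
samples = [
    ("node1", 0.0), ("node1", 0.0),
    ("node2", 0.0), ("node2", 0.0),
    ("node3", 98.0), ("node3", 99.0),
]
busy, idle = summarize_cpu(samples)
print("busy nodes:", busy)
print("idle nodes:", idle)
```

If the idle list always contains every node except the last one, whatever the process count, that points at the MPI startup or communication layer rather than at the input system.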
David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596, 75124 Uppsala, Sweden
phone: 46 18 471 4205 fax: 46 18 511 755
spoel at xray.bmc.uu.se spoel at gromacs.org http://xray.bmc.uu.se/~spoel