[gmx-users] strange problem with parallel mpirun

David van der Spoel spoel at xray.bmc.uu.se
Sun Dec 4 13:53:57 CET 2005


Lubos Vrbka wrote:
> hi guys,
> 
> i encounter the following problem when running mdrun in parallel:
> 
> our setup is 2-processor nodes
> run on 2 processors - 1 node (mpich - shmem) is ok
> run on 4 processors - 2 nodes (mpich - socket-p4) is ok
> run on 6 processors - 3 nodes (mpich - socket-p4) is not ok
> what happens? all 6 processes are started on the appropriate nodes. 
> however, only processes on the last node really calculate something 
> (they take ~98-99% of the processor time). processes on the first two 
> nodes can be found using ps uax - they don't consume enough resources to 
> be displayed in the output of top...
> 
> i tried to run it on 4 processors, 1 on each node - it was ok
> then with 5 processors, one on each node - the same problem - fifth 
> processor was running the calculation, the other 4 weren't doing anything.
> 
> i tried to find something in the archives, but to no avail :( this seems 
> really strange... does anyone know what's going on here? is there any 
> limit on the number of processors?
> 
> thank you for any hint. with best regards,
> 
What kind of system is this?

In general, MPICH always gives problems, at least with GROMACS. If at all 
possible, try it with LAM.
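
One way to confirm the symptom described in the quoted message (processes 
that show up in ps but not in top) is to filter the "ps aux" listing on each 
node for the mdrun processes and flag those with negligible CPU use. The 
following is a minimal Python sketch, not part of GROMACS or MPICH. The 
function names, the 1.0% threshold, and the assumption that the executable 
name contains "mdrun" are all illustrative, and it assumes the usual 
11-column "ps aux" layout found on Linux.

```python
import subprocess


def parse_ps_aux(ps_output, name="mdrun"):
    """Return (pid, cpu_percent, command) for processes whose command mentions `name`.

    Assumes the common `ps aux` column order:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 10)  # keep the full command as one field
        if len(fields) < 11:
            continue  # not a complete process line
        if name in fields[10]:
            rows.append((int(fields[1]), float(fields[2]), fields[10]))
    return rows


def idle_processes(rows, threshold=1.0):
    """Return the rows whose CPU usage is below `threshold` percent (assumed cutoff)."""
    return [row for row in rows if row[1] < threshold]


def report_local(name="mdrun"):
    """Run `ps aux` on this machine and print each matching process with its state."""
    out = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    rows = parse_ps_aux(out, name)
    idle = set(idle_processes(rows))
    for row in rows:
        pid, cpu, command = row
        state = "IDLE" if row in idle else "busy"
        print(f"{state} pid={pid} cpu={cpu:.1f}% {command}")
```

Calling report_local() on every node of the job (for example over ssh) would 
show at a glance which nodes hold mdrun processes that are started but not 
computing. Note that the %CPU column of ps is an average over the lifetime of 
the process, so a freshly started job may need a minute before the numbers 
are meaningful.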


-- 
David.
________________________________________________________________________
David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596,  	75124 Uppsala, Sweden
phone:	46 18 471 4205		fax: 46 18 511 755
spoel at xray.bmc.uu.se	spoel at gromacs.org   http://xray.bmc.uu.se/~spoel
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++