[gmx-users] strange problem with parallel mpirun

Lubos Vrbka lubos.vrbka at gmail.com
Sun Dec 4 13:17:58 CET 2005


hi guys,

i encounter the following problem when running mdrun in parallel:

our setup consists of 2-processor nodes:
run on 2 processors - 1 node (mpich - shmem) is ok
run on 4 processors - 2 nodes (mpich - socket-p4) is ok
run on 6 processors - 3 nodes (mpich - socket-p4) is not ok

what happens? all 6 processes are started on the appropriate nodes.
however, only the processes on the last node actually calculate anything
(they each take ~98-99% of the processor time). the processes on the first
two nodes can be found using ps uax, but they consume too few resources to
be displayed in the output of top...
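
to make the check above repeatable on every node, here is a minimal sketch that pulls the cpu share of the mdrun processes out of ps output. the function names and the default process name "mdrun" are only assumptions for illustration, and it assumes the usual 11-column layout of ps aux on linux. note that the %CPU column of ps is averaged over the lifetime of the process, so it is not the same instantaneous value that top shows.

```python
import subprocess


def cpu_by_process(ps_output, name="mdrun"):
    """Return a list of (pid, cpu_percent) for commands containing `name`.

    Assumes the standard `ps aux` columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 10)        # keep COMMAND as one field
        if len(fields) < 11:
            continue                         # ignore malformed lines
        if name in fields[10]:
            rows.append((int(fields[1]), float(fields[2])))
    return rows


def local_snapshot(name="mdrun"):
    """Run `ps aux` on the local machine and filter it (hypothetical helper)."""
    out = subprocess.run(["ps", "aux"], capture_output=True,
                         text=True, check=True).stdout
    return cpu_by_process(out, name)
```

calling local_snapshot() on each node in turn (for example over ssh) should show which processes are busy and which are idle.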

i also tried running one process per node: with 4 processors on 4 nodes
it was ok. with 5 processors on 5 nodes the same problem appeared - only
the fifth process was running the calculation, the other 4 weren't doing
anything.

i tried to find something in the archives, but to no avail :( this seems 
really strange... does anyone know what's going on here? is there any 
limit on the number of processors?

thank you for any hint. with best regards,

-- 
Lubos