[gmx-users] mdrun mpi problem

Mark Abraham Mark.Abraham at anu.edu.au
Mon Dec 15 07:53:34 CET 2008


hazizian wrote:
> Hi
> 
> I have 3 computers and I want to run mdrun_mpi across them.
> I defined these machines in hostfile, lambhost-def, lam-hostmap.txt.
> After lamboot, lamnodes reports that I have 3 nodes:
> 
> homa@dma210:~/etc> lamnodes
> n0      dma210.dma:2:origin,this_node
> n1      dma211.dma:2:
> n2      dma212.dma:2:
> 
> then I do:
> 
> homa@dma210:~/gromacs5> mpirun -np 6 mdrun_mpi4 -v -s md300.tpr -o md300.trr 
> -c md300.pdb -e md300.edr -g md300.log 
> 
> Getting Loaded...
> Reading file md300.tpr, VERSION 3.3.3 (single precision)

This should still work, but it would be better to generate the .tpr 
with the grompp from the same GROMACS version as the mdrun you are using.

> Note: tpx file_version 40, software version 58
> Loaded with Money
> Making 1D domain decomposition 6 x 1 x 1
> 
> WARNING: This run will generate roughly 14158 Mb of data
> 
> starting mdrun 'SWISS-MODEL SERVER (http:'
> 10000000 steps,  10000.0 ps.
> step 0
> imb F 23% step 100, will finish Thu Jan  8 20:24:20 2009
> imb F 21% step 200, will finish Sat Jan 10 16:52:31 2009
> imb F 32% step 300, will finish Mon Jan 12 02:15:15 2009
>                                                                  
> 
> It works, and when I run top it seems that 2 mdrun_mpi processes 
> are running on every computer. For example, on one computer:
>   
>   
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2507 homa      16   0 48232  19m 4096 R   42  0.5   1:29.30 mdrun_mpi4
>  2508 homa      16   0 45868  16m 3768 S   33  0.4   0:57.30 mdrun_mpi4
>  2511 homa      16   0 67668  18m  15m R    1  0.5   0:00.48 konsole
> 
> One of my questions is: why is the CPU usage of these 2 MPI processes 
> not 100%?

Unless each machine is dual-core, you would not expect both processes 
on it to reach 100% CPU.
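
A minimal check of how many CPUs each machine really has, assuming a 
Linux system that exposes /proc/cpuinfo (the variable name is only for 
illustration). The count given to LAM in the host file is whatever was 
declared there, so it is worth comparing against the hardware:

```shell
# Count the logical CPUs the Linux kernel reports on this machine.
# Assumes /proc/cpuinfo lists one "processor" line per logical CPU.
cores=$(grep -c '^processor' /proc/cpuinfo)
echo "logical CPUs: $cores"
```

Run it on each of the three machines. If it reports 1, two mdrun 
processes on that machine have to share a single CPU.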

> When I execute this command on one node, the run is faster (it will 
> finish on 19 Dec.) and the CPU usage is about 100% for all 6 jobs.
> 
> Is there a problem with my setup?

Your network connections are probably not fast enough for you to benefit 
from running in parallel across machines. There have been many discussions 
of this on this list. Try searching the archives.
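
To put a number on the slowdown, here is a rough sketch that compares 
the two projected finish dates quoted above. It assumes GNU date, and it 
assumes that "19 Dec." means midnight on 19 Dec 2008, since no year or 
time was given:

```shell
# Projected finish of the 3-machine run, from the first mdrun estimate
# (Thu Jan  8 20:24:20 2009), rewritten in ISO form for date -d.
t_cluster=$(date -u -d "2009-01-08 20:24:20" +%s)
# Projected finish of the single-machine run. The year and the
# midnight time are assumptions; only "19 Dec." was reported.
t_single=$(date -u -d "2008-12-19 00:00:00" +%s)
# Whole days between the two projections.
days=$(( (t_cluster - t_single) / 86400 ))
echo "3-machine run finishes about $days days later"
```

This prints a difference of about 20 days, so under these assumptions 
the run across three machines is much slower than the run on one.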

Mark


