[gmx-users] about parallel run

Yunierkis Perez Castillo yunierkis at uclv.edu.cu
Wed Jan 16 14:49:56 CET 2008


I decided to recompile mdrun with MPI support and restart the
simulation. I also checked whether LAM was still running on the nodes;
it was, so I halted it before booting it again.
Now everything seems to be running OK.
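
For reference, the full sequence that worked for me was roughly the
following (a sketch, assuming the same .nodes file and input names as in
my original script; note that grompp's -o and mdrun's -s must name the
same .tpr file, and that the -np given to mdrun must match the -np given
to grompp):

  lamhalt                   # shut down the stale LAM universe first
  lamboot -v .nodes         # boot LAM on the 6 dual-CPU nodes
  grompp -f md.mdp -c rec_pr.gro -p rec.top -o rec_md.tpr -np 12
  mpirun -np 12 mdrun_mpi -np 12 -s rec_md.tpr -deffnm rec_md -nice 0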



On Tue, 2008-01-15 at 12:24 -0500, chris.neale at utoronto.ca wrote:

> > What I was trying to do is run a parallel simulation. I have
> > successfully compiled mdrun with MPI support.
> >
> > This is my grompp command
> >
> > grompp -f md.mdp -c rec_pr.gro -p rec.top -o rec.tpr -np 12
> >
> > And this is the script I sent to the cluster:
> >
> > #!/bin/bash
> >
> > cd /home/yunierkis/MD
> > export LAMRSH="ssh -x"
> > lamboot -v .nodes
> > nohup mpirun -np 12 mdrun_mpi -s rec_md.tpr -o rec_md.trr -c rec_md.gro
> > -e rec.edr -g rec_md.log -nice 0 -np 12 &
> >
> 
> Are you sure that you are running LAM correctly on your machine?
> I would personally run it like this:
> /tools/lam/lam-7.1.2/bin/mpirun C mdrun_mpi -np 12 -deffnm rec_md
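> 
> ("C" tells LAM's mpirun to start one copy on every CPU that lamboot
> brought up, so on your 6 dual-CPU nodes it expands to 12 processes. You
> can check what LAM thinks is available with:
> 
> lamnodes
> 
> which should list all 6 nodes with a CPU count of 2.)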
> 
> >
> > I have deleted all the #md.trr.*# files; the simulation is still running
> > on all 6 nodes and no new #md.trr.*# files have been created.
> > This seems very strange to me and I can't find an explanation.
> 
> Did you have other #files#, e.g. the .edr or .log?
> 
> Do you lamhalt properly after previous attempts?
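> 
> For example, before each new attempt something like this clears out any
> stale LAM universe, which can easily cause odd behaviour:
> 
> lamclean            # kill leftover MPI processes in the running universe
> lamhalt             # shut the LAM universe down
> lamboot -v .nodes   # boot a fresh one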
> 
> >
> > Yunierkis
> >
> >
> > On Tue, 2008-01-15 at 11:54 +1100, Mark Abraham wrote:
> >
> >> Yunierkis Perez Castillo wrote:
> >> > Hi all, I'm new to GROMACS. I have set up a protein MD simulation on a
> >> > cluster; I'm using 6 computers with 2 CPUs each.
> >> > After GROMACS began running, I had 12 trajectory files in the folder
> >> > where the output is written:
> >> >
> >> > md.trr
> >> > #md.trr.1#
> >> > #md.trr.2#
> >> > ................
> >> > #md.trr.11#
> >> >
> >> > It seems like the trajectory is replicated by each CPU the simulation
> >> > is running on.
> >> > All the files have the same size and grow simultaneously as the
> >> > simulation advances.
> >> > Is that normal?
> >> > Can I delete the #* files?
> >>
> >> I infer from your results that you've run 12 single-processor
> >> simulations from the same working directory. GROMACS makes backups of
> >> files when you direct it to write to an existing file, and these are
> >> numbered as #filename.index#. Your 12 simulations are all there, but you
> >> can't assume that those files with number 5 are all from the same
> >> simulation, because of the possibility of filesystem asynchronicities in
> >> creating the files.
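> >>
> >> As an illustration: starting mdrun three times in a row in the same
> >> directory with the same output names would leave
> >>
> >> md.trr       (the newest run)
> >> #md.trr.1#   (the first run, backed up)
> >> #md.trr.2#   (the second run, backed up)
> >>
> >> but with 12 simultaneous starts there is no such guarantee about which
> >> backup number belongs to which process.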
> >>
> >> If you're trying to run 12 single-processor simulations in the same
> >> working directory, then you need to rethink your strategy. If you're
> >> trying to do something else, then you also need to rethink :-)
> >>
> >> Mark
