Re: [gmx-users] Question about starting Gromacs 4.5.4 parallel runs using mpirun

Peter C. Lai pcl at uab.edu
Sat Apr 28 19:14:10 CEST 2012


Oops, I meant that PATH should include /usr/local/gromacs/bin (not /usr/local/bin).

Also, 4.5.4 supports threading if it is compiled that way. Most clusters don't use it because nodes on separate chassis usually don't share memory with each other. We get really bad performance when running threading over ScaleMP compared to the typical Open MPI setup on identical hardware and infrastructure.
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

"Peter C. Lai" <pcl at uab.edu> wrote:

Possibly. Make sure PATH includes /usr/local/bin and LD_LIBRARY_PATH includes the directory where libmd_mpi.so and libgmx_mpi.so are located.
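
A quick way to check both variables from the shell might look like the sketch below. The helper name dir_in_list is made up for illustration. /usr/local/gromacs/bin is the directory that "which mdrun" reports in the original question, while /usr/local/gromacs/lib is only an assumed location for the libraries; substitute wherever libmd_mpi.so and libgmx_mpi.so actually live on your node.

```shell
# Return 0 if directory $2 is one of the colon-separated entries in list $1.
# dir_in_list is a hypothetical helper, not part of Gromacs or MPI.
dir_in_list() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# /usr/local/gromacs/bin is where "which mdrun" found the binary;
# /usr/local/gromacs/lib is an ASSUMED library location, adjust as needed.
if dir_in_list "$PATH" /usr/local/gromacs/bin; then
    echo "PATH ok"
else
    echo "PATH is missing /usr/local/gromacs/bin"
fi

if dir_in_list "${LD_LIBRARY_PATH:-}" /usr/local/gromacs/lib; then
    echo "LD_LIBRARY_PATH ok"
else
    echo "LD_LIBRARY_PATH is missing /usr/local/gromacs/lib"
fi
```

If either check fails only when the job is started through mpirun, the variables are probably being set in a startup file that the launched shell does not read.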

Try adding -v to mdrun and redirecting the output to a log file at the end of the mpirun line (like mpirun blah -np 8 mdrun -v -deffnm etcetc > mpilog). The contents of mpilog will tell you what MPI is trying to do...
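
The logging idea could be wrapped up roughly as below. run_logged is a hypothetical helper name. The only addition to the command in the message is 2>&1, which also sends error output to the log, since launchers often report problems on stderr rather than stdout.

```shell
# Run a command with stdout and stderr both captured in a log file.
# run_logged is a hypothetical helper.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    "$@" > "$log" 2>&1
}

# Example with the command line from this thread (commented out so the
# snippet can be pasted without starting a run):
# run_logged mpilog mpirun -machinefile mymachines -np 8 mdrun -v -s topol.tpr
```

Afterwards, "less mpilog" shows everything mpirun and mdrun printed, including any complaint about a missing program or library.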

Andrew DeYoung <adeyoung at andrew.cmu.edu> wrote:

Hi,

Typically, I use Gromacs 4.5.5 compiled with automatic threading. As you
know, automatic threading is awesome because it allows me to start parallel
runs without calling mpirun. So on version 4.5.5, I can start a job on eight
CPUs using simply the command:

mdrun -s topol.tpr -nt 8

However, now I am using a different node on my department's cluster, and
this node instead has Gromacs 4.5.4 (compiled without automatic threading).
So, I must use mpirun to start parallel runs. I have tried this command:

mpirun -machinefile mymachines -np 8 mdrun -s topol.tpr

where mymachines is an (extensionless) file containing only the text "c60
slots=8". (c60 is the name of the node that I am using.)
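
For reference, a machinefile with that content could be created as sketched below. The "slots=N" syntax is Open MPI's hostfile format; whether the mpirun on this node is Open MPI is an assumption here.

```shell
# Write a one-line hostfile: node c60 offering 8 slots.
printf 'c60 slots=8\n' > mymachines

# Show what mpirun will read.
cat mymachines
```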

I get this error message:

"Missing: program name. Program mdrun either does not exist, is not
executable, or is an erroneous argument to mpirun."

This is strange, because mdrun is, I think, in my path. For example, if I
type "mdrun -h", I get the manual page for mdrun (version 4.5.4).

Then I tried the command "which mdrun", and it gave me this output:

/usr/local/gromacs/bin/mdrun

So, next I tried to call mdrun via mpirun using the specific path for mdrun:

mpirun -machinefile mymachines -np 8 /usr/local/gromacs/bin/mdrun -s topol.tpr

This starts running my simulation, but when I look in "top", the simulation
is only running on a single CPU; there is only one entry for mdrun in "top",
and it has only %CPU=100 (not eight different entries for mdrun, nor one
entry with %CPU=800). Also, the simulation is going at the speed I would
expect for running on a single CPU -- it is very slow, so I am convinced
that, as "top" suggests, mdrun is running on only one CPU.
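
To confirm what "top" suggests, one could count the mdrun processes directly and also check whether the binary is linked against an MPI library at all. This is a rough sketch, assuming a Linux node with pgrep and ldd available; count_procs and links_mpi are hypothetical helper names, and the library-name pattern is only a guess based on common names (libmpi, libmpich, and the *_mpi.so Gromacs libraries mentioned in the reply above).

```shell
# Print how many running processes have a name exactly equal to $1.
count_procs() {
    pgrep -x "$1" | wc -l | tr -d ' '
}

# Return 0 if the dynamic dependencies of binary $1 mention an MPI library.
# The pattern is an assumption; adjust it for the MPI installed on the node.
links_mpi() {
    ldd "$1" 2>/dev/null | grep -Eq 'libmpi|_mpi\.so'
}

echo "mdrun processes: $(count_procs mdrun)"

# Example check on the binary from this thread (commented out because the
# path only exists on the cluster node):
# links_mpi /usr/local/gromacs/bin/mdrun && echo "MPI build" || echo "no MPI library found"
```

A count of 1 while mpirun was asked for -np 8 would match the single entry in "top"; a binary with no MPI library in its ldd output cannot cooperate across ranks no matter how it is launched.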

Strangely, my colleagues are able to run jobs in parallel using the exact
commands that I described above. So apparently something is wrong with my
user ID, although there are no error messages (except the error message
about "Missing: program name" that I described).

If you have time, do you have any suggestions for other things that I can
try? Do you think that something could be wrong with my bashrc file?

Thanks for your time!

Andrew DeYoung
Carnegie Mellon University

