[gmx-users] Fwd: Trouble with mpirun gmx- 2/2 machines not sharing correctly

Szilárd Páll pall.szilard at gmail.com
Fri Mar 11 01:41:35 CET 2016


On Thu, Mar 10, 2016 at 7:24 AM, Jacob Nowatzke <jn681 at humboldt.edu> wrote:

> I've recently installed OpenMPI 1.10.2 alongside GROMACS 5.1.2. I've read
> all over the web about how to, but I'm having issues left and right. I've
> finally gotten to a point where *gmx* works, but in an odd way. I'll try to
> explain without too much detail.
>
>
> Machine1=
> CPU: 8-core AMD FX9590
> GPU: AMD HD8990 (dual-core)
> RAM: 32GB
>
> Machine2=
> CPU: 6-core AMD FX6120
> GPU: AMD R9 Nano (single-core)
> RAM: 8GB
>
> Machines are networked by LAN, which cat I don't remember (may be reason
> for slow mdrun described soon?). I have added so many $PATH variables on
> and off- I'm tired of it. I do have to ask if I have to add
> usr/local/gromacs/bin to the $PATH and $LD_PATH in .bashrc? When I did this
> is when the machines started picking up on each other, as they would not
> when only OMPI was specified in .bashrc. What's going on here?
>

Could it be that your assumptions about the way the MPI launcher starts a
binary are not correct?
The MPI launcher has to know where to get the binaries from on all hosts
involved. You can either
i) have gmx in the PATH on every host and launch by just passing "gmx" to
mpirun, or
ii) pass a "/full/path/to/gmx" to mpirun that is correct for all nodes; this
can be achieved by using a shared file system or by having the binary at
the same location on both machines (alternatively, you can use a
wrapper script on both machines that hard-codes a custom path).
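
For illustration, the wrapper-script variant could look roughly like the
sketch below. The file name gmx_wrapper.sh, the GMX_BIN_DIR variable and the
find_gmx helper are hypothetical names chosen for this example, and the
default directory is only a guess at a typical install location.

```shell
#!/bin/sh
# Hypothetical wrapper; the file name gmx_wrapper.sh and GMX_BIN_DIR are assumptions.
# Set GMX_BIN_DIR to wherever gmx is installed on *this* machine.
GMX_BIN_DIR="${GMX_BIN_DIR:-/usr/local/gromacs/bin}"

# Print the gmx binary to use: the hard-coded location if it exists there,
# otherwise whatever the PATH provides (non-zero status if neither is found).
find_gmx() {
    if [ -x "$GMX_BIN_DIR/gmx" ]; then
        printf '%s\n' "$GMX_BIN_DIR/gmx"
    else
        command -v gmx
    fi
}

# When called with arguments (e.g. "mdrun -v -deffnm nvt"), hand over to gmx.
if [ "$#" -gt 0 ]; then
    gmx_bin=$(find_gmx) || { echo "gmx not found on $(hostname)" >&2; exit 127; }
    exec "$gmx_bin" "$@"
fi
```

With the script stored at the same path on both machines, the launch would
then be something like
"mpirun --host Machine1,Machine2 /full/path/to/gmx_wrapper.sh mdrun -v -deffnm nvt"
(assuming Open MPI's --host option), and each node resolves its own gmx.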


> My issue- besides understanding it all, this is the point where I've got
> things somewhat working- is that I run something like "mpirun --hosts
> Machine1,Machine2 --path usr/local/gromacs/bin gmx mdrun -v -deffnm nvt"
> from Machine2 over ssh from separate laptop, everything seems to be going
> smoothly... except this is only being run on Machine1(8990 is also
> detected), with Machine2 components not being detected. Essentially, I'm
> running gmx mdrun on Machine1 as if I was using ssh from Machine2.
>

You need to show us the data that led you to this conclusion. My guess -
and I can again only guess - is that you may have misunderstood the log
output which displays detailed detection for the host where rank #0 runs,
but since 5.1 it also displays a summary (i.e. in your case it should say
that the run has detected three GPUs).
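
A quick way to check this is to pull every GPU-related line out of the log.
The sketch below assumes the log is nvt.log (which is what -deffnm nvt should
produce) and makes no assumption about the exact wording of the summary; the
helper name gpu_lines is made up for this example.

```shell
# Print every line of an mdrun log that mentions GPUs (case-insensitive),
# prefixed with its line number so it can be located in the full log.
gpu_lines() {
    grep -n -i 'gpu' "$1"
}

# Usage: gpu_lines nvt.log
```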


> Obviously, this isn't what I want to do. I'd like to see each machine with
> their respective GPUs being utilized for gmx mdrun and the such.
>
> If Machine1 is actually being utilized, then why is it performing steps at
> a much slower rate than if I had used the same machine from ssh?
>

As Mark suggests, my guess too would be that your setup (hardware or
simulation) is just not suitable for multi-node runs.

Show us the log files and we'll likely be able to tell much more.

BTW, you can also simply ssh into the machines in question after launching
the run and check what's happening with e.g. top/htop.
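
As a rough sketch, that check can be done for both machines in one go. The
host names are the ones used in this thread; the check_hosts function and the
REMOTE_SH override (handy for a dry run with "echo") are hypothetical, and
passwordless ssh to both machines is assumed.

```shell
# Take one batch-mode snapshot of top on each host given as an argument.
# REMOTE_SH defaults to ssh; set REMOTE_SH=echo to only print the commands.
check_hosts() {
    for host in "$@"; do
        echo "== $host =="
        ${REMOTE_SH:-ssh} "$host" 'top -b -n 1 | head -n 15'
    done
}

# Usage, while the run is in progress: check_hosts Machine1 Machine2
```

If gmx shows up with high CPU usage on only one of the hosts, the run is
indeed not using both machines.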

--
Szilárd


> To top off this issue, when I run gmx grompp (or genion) it freezes before
> bringing me back to the command line.
>
> Obviously, I'm new, so I want to say that I really appreciate any help
> here. Learning GROMACS has been a pleasure so far and I'm hoping to get
> details down to utilize for homology modeling. Thanks for your help
>
> -Jacob