[gmx-users] Gromacs-4.6.6 + openmpi 1.6: no output running in 2 nodes
Elton Carvalho
eltonfc at gmail.com
Sat Aug 16 02:02:46 CEST 2014
On Fri, Aug 15, 2014 at 7:28 PM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
> Hi,
>
> This kind of thing suggests a problem with the working directory at run
> time being either not what you think it is, or pointing at a file system
> that isn't working properly.
That's weird, because the only difference between the job scripts for
1 or 2 nodes is the
#PBS -l nodes=1:ppn=8
line, which is changed accordingly. With nodes=1 mdrun runs as expected
on 8 cores. With nodes=2, I get no output.
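For reference, the job script boils down to something like this (only a
sketch; mdrun_mpi and topol stand in for the actual binary and run names
here):

    #!/bin/bash
    #PBS -l nodes=2:ppn=8
    cd $PBS_O_WORKDIR
    echo $PWD
    mpirun -np 16 mdrun_mpi -v -deffnm topol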
> Try some
>
> echo $PWD
>
> statements in your job script, and/or run another MPI test program to
> gather some data.
>
The echo $PWD is already there and shows what I expected. Also, an MPI
hello-world program[1] runs fine on more than one node. I could try to add
some file handling to that code and see if Lustre is misbehaving; a quick
shell-level version of that test is sketched below.
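Something along these lines should show whether every rank can write to
the work directory (only a sketch: -np 16 assumes 2 nodes x 8 ppn, the
-hostfile flag may be redundant if Open MPI was built with Torque support,
and OMPI_COMM_WORLD_RANK is the per-rank variable Open MPI exports to each
launched process):

    cd $PBS_O_WORKDIR
    # have every rank write its hostname into a file in the Lustre work dir
    mpirun -np 16 -hostfile $PBS_NODEFILE \
        sh -c 'hostname > write_test.$OMPI_COMM_WORLD_RANK'
    ls write_test.*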
Also, what I mean by "no output" includes stdout and stderr: when I run
mdrun -v, there's nothing in the job's stdout or stderr, even though the
MPI hello-world successfully writes to stdout. For this reason I don't
believe it's a filesystem issue.
Any other ideas? I'm puzzled. Thanks!
[1] http://mpitutorial.com/mpi-hello-world/
--
Elton Carvalho
> On Fri, Aug 15, 2014 at 4:22 PM, Elton Carvalho <eltonfc at gmail.com> wrote:
>
>> Good evening, fellow users!
>>
>> I'm trying to build gromacs 4.6.6 on a university cluster, which has
>> openmpi 1.6 and Intel Composer XE 2013.
>>
>> When I run it on one node, with 8 CPUs, mdrun behaves as expected.
>> When I run it on more than one node, 8 mdrun processes are spawned on
>> each node using 100% CPU, but no output files are written. I get no
>> output from gromacs at all.
>>
>> Any ideas on how to diagnose it? For completeness, the CMake
>> cache file is here: http://pastebin.com/p5QaKpef
>> I wonder if I missed something with the MPI libraries.
>>
>> I tried building mdrun with OpenMP off, relying only on MPI, and the
>> behavior is the same.
>>
>> Thanks in advance!
>>
>> --
>> Elton Carvalho