[gmx-users] Gromacs 4.6.7 with MPI and OpenMP
pall.szilard at gmail.com
Fri May 8 15:56:14 CEST 2015
On Fri, May 8, 2015 at 2:50 PM, Malcolm Tobias <mtobias at wustl.edu> wrote:
> Hi Mark,
> On Friday 08 May 2015 11:51:03 Mark Abraham wrote:
>> > I'm attempting to build gromacs on a new cluster and following the same
>> > recipes that I've used in the past, but encountering a strange behavior:
>> > It claims to be using both MPI and OpenMP, but I can see by 'top' and the
>> > reported core/walltime that it's really only generating the MPI processes
>> > and no threads.
>> I wouldn't take the output from top completely at face value. Do you get
>> the same performance from -ntomp 1 as -ntomp 4?
> I'm not relying on top. I also mentioned that the core/walltime as reported by Gromacs suggests that it's only utilizing 2 cores. I've also been comparing the performance to an older cluster.
What's being utilized vs what's being started are different things. If
you don't believe the mdrun output - which is quite likely correct
about the 2 ranks x 4 threads - use your favorite tool to check the
number of ranks and threads started and their placement. That will
explain what's going on...
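One generic way to check what was actually started (a diagnostic sketch, not GROMACS-specific; the process name mdrun_mpi is an assumption - substitute whatever your binary is called):

```shell
# List each matching process with its PID, thread count (NLWP), and the
# last CPU it ran on (PSR). Replace "mdrun_mpi" with your actual binary name.
ps -C mdrun_mpi -o pid=,nlwp=,psr=,comm=

# The same NLWP field can be inspected for any PID, e.g. the current shell:
ps -o nlwp= -p $$
```

If NLWP stays at 1 per rank while mdrun claims 4 OpenMP threads, the threads really were never spawned; if NLWP is 4 but PSR shows all threads crowded onto the same core, it is a placement/affinity problem instead.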
>> > We're running a heterogeneous environment, so I tend to build with
>> > MPI/OpenMP/CUDA and the Intel compilers, but I'm seeing this same sort of
>> > behavior with the GNU compilers. Here's how I'm configuring things:
>> > [root at login01 build2]# cmake -DGMX_FFT_LIBRARY=mkl -DGMX_MPI=ON
>> > -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/opt/cuda -DGMX_OPENMP=ON
>> > -DCMAKE_INSTALL_PREFIX=/act/gromacs-4.6.7_take2 .. | tee cmake.out
>> You need root access for "make install." Period.
> Yes Mark, I ran 'make install' as root.
>> Routinely using root means you've probably hosed your system some time...
> In 20+ years of managing Unix systems I've managed to hose many a system.
>> > Using 2 MPI processes
>> > Using 4 OpenMP threads per MPI process
>> > although I do see this warning:
>> > Number of CPUs detected (16) does not match the number reported by OpenMP
>> > (1).
>> Yeah, that happens. There's not really a well-defined standard, so once the
>> OS, MPI and OpenMP libraries all combine, things can get messy.
> Understood. On top of that we're using CPUSETs with our queuing system which can interfere with how the tasks are distributed. I've tried running the job outside of the queuing system and have seen the same behavior.
Very likely that's exactly what's screwing things up. We try to be
nice and back off (mdrun should note that in its output) when
affinities are set externally, on the assumption that they were set
for a good reason and to correct values. Sadly, that assumption often
proves to be wrong. Try running with "-pin on", or turn off the
CPUSET-ing (or double-check that it's right).
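To see whether a CPUSET or externally set affinity is restricting the run, the affinity mask of any process can be inspected with taskset (the mpirun line below is a hypothetical invocation following the "-pin on" suggestion, not a verified recipe - adjust binary names and options to your setup):

```shell
# Show the CPU affinity list of the current shell; a rank started under a
# restrictive CPUSET would list fewer CPUs than the machine actually has.
taskset -cp $$

# Hypothetical relaunch forcing mdrun's internal pinning:
# mpirun -np 2 mdrun_mpi -ntomp 4 -pin on -deffnm topol
```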
>> But if people go around using root routinely... ;-)
> As soon as I figure out how to manage a computing cluster without becoming root I'll let you know ;-)
> I've got dozens of Gromacs users, so I'm attempting to build the fastest, most versatile binary that I can. Any help that people can offer is certainly appreciated.
Post logs and std outputs, please. That will allow us to check things
rather than guess.
> Malcolm Tobias