[gmx-developers] getting gromacs to run on ORNL summitdev
Szilárd Páll
pall.szilard at gmail.com
Fri May 19 16:53:04 CEST 2017
Have you tried the *latest* release?
--
Szilárd
On Fri, May 19, 2017 at 4:51 PM, Sedova, Ada A. <sedovaaa at ornl.gov> wrote:
> Szilárd,
>
> OK, thanks. I was referring to the SIMD=VMX vs. VSX directives, and yes, I know it is fine not to specify them, but we would like maximum performance. That is not the important issue here, though; it is beside the actual question asked. I was just trying to show you that the build seemed to be successful.
>
> *The main problem I am writing about is the cudaMallocHost error I get when trying to run more than 4 MPI ranks. Please see the later sections of the original email.*
>
> Thanks so much,
> Ada
>
>
>
> Ada Sedova
> Postdoctoral Research Associate
> Scientific Computing Group, NCCS
> Oak Ridge National Laboratory, Oak Ridge, TN
>
> ________________________________________
> From: gromacs.org_gmx-developers-bounces at maillist.sys.kth.se <gromacs.org_gmx-developers-bounces at maillist.sys.kth.se> on behalf of Szilárd Páll <pall.szilard at gmail.com>
> Sent: Friday, May 19, 2017 10:33 AM
> To: Discussion list for GROMACS development
> Subject: Re: [gmx-developers] getting gromacs to run on ORNL summitdev
>
> Hi Ada,
>
> First of all, please use the latest release; v2016 has been tested
> quite well on IBM Minsky nodes.
>
> I'm not sure which altivec option you are referring to, but none should be
> necessary; the build system should generate correct and sufficient
> compiler options for decent performance (unless, of course, you want to
> tune them further).
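> If you do want to pin the SIMD level down explicitly, a configure
> invocation along these lines should do it (treat the other flags and the
> source path as placeholders for whatever you actually use on summitdev):
>
>   cmake .. -DGMX_MPI=ON -DGMX_GPU=ON -DGMX_SIMD=IBM_VSX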
>
> NVML is only needed if you want reporting of the GPU clocks or if you want
> to allow mdrun to change them -- which won't work anyway, as AFAIK NVIDIA
> does not allow a user-space process to change clocks on the P100 (and
> later?).
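> If you do want to inspect or set the application clocks by hand,
> nvidia-smi can do that, given sufficient permissions; the exact clock
> values are device-specific, so take this only as a sketch:
>
>   nvidia-smi -q -d CLOCK              # show current clocks
>   nvidia-smi -q -d SUPPORTED_CLOCKS   # list supported application clocks
>   nvidia-smi -ac <memMHz>,<gfxMHz>    # set application clocks (needs admin rights)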
> make check should also work, but it's easier if you build without the
> regression tests enabled in CMake and run the regression tests
> separately. That way you can specify the MPI launcher, a custom binary
> suffix, etc., e.g.
> perl gmxtest.pl -mpirun mpirun -np N -ntomp M all
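> For instance, with an _mpi binary suffix and 4 ranks with 5 OpenMP threads
> each, that might look something like the following (I'm assuming the usual
> gmxtest.pl options here; adjust to your setup):
>
>   perl gmxtest.pl -suffix _mpi -np 4 -ntomp 5 -mpirun mpirun all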
>
>
> Cheers,
> --
> Szilárd
>
>
> On Fri, May 19, 2017 at 3:45 PM, Sedova, Ada A. <sedovaaa at ornl.gov> wrote:
>>
>> Hi folks,
>>
>>
>> I am a member of the Scientific Computing group at Oak Ridge National Lab's
>> National Center for Computational Sciences (NCCS). We are preparing
>> high-performance codes for public use on our new Summit system by testing on
>> the prototype SummitDev:
>>
>>
>> The SummitDev system is an early-access system that is one generation
>> removed from OLCF's next big supercomputer, Summit. The system contains
>> three racks, each with 18 IBM POWER8 S822LC nodes, for a total of 54 nodes.
>> Each IBM S822LC node has 2 IBM POWER8 CPUs and 4 NVIDIA Tesla P100 GPUs, plus
>> 32 8-GB DDR4 memory modules (256 GB per node). Each POWER8 CPU has 10 cores,
>> each with 8 hardware threads. The GPUs are connected by NVLink 1.0 at 80 GB/s,
>> and each GPU has 16 GB of HBM2 memory. The nodes are connected in a full
>> fat-tree via EDR InfiniBand, and the racks are liquid cooled via heat exchangers.
>> SummitDev has access to Spider 2, the OLCF's center-wide Lustre parallel
>> file system.
>>
>>
>> I have built GROMACS 5.1.4, seemingly successfully, with the GPU, MPI, and
>> SIMD options set via CMake directives (I didn't get the altivec option
>> completely right, but that should be OK). The only things that were not found
>> during the CMake configure were the NVML library (we do not have the
>> Deployment package installed), LAPACK (although it did find ESSL after I set
>> the BLAS directive), and some includes like io.h.
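>> (The configure invocation was roughly of this form -- the exact paths,
>> compilers, and remaining flags may differ:
>>
>>   cmake .. -DGMX_MPI=ON -DGMX_GPU=ON -DGMX_SIMD=IBM_VMX \
>>            -DGMX_BLAS_USER=<path to the ESSL library>
>> )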
>>
>>
>> I cannot use make check, because on SummitDev I have to launch gmx_mpi
>> with mpirun, and I think make check does not call gmx that way. I have
>> manually tested the build tools (pdb2gmx, editconf, grompp, solvate), and
>> they work fine.
>>
>>
>> However, I cannot get mdrun to work even on one node on a small box of
>> water. After playing with the number of processes until the OpenMP threads
>> message stopped complaining, I get the following:
>>
>>
>>>>cudaMallocHost of size 1024128 bytes failed: all CUDA-capable devices are
>>>> busy or unavailable
>>
>> Here is the complete context for this error:
>>
>>
>> ****************************************************************************************************
>>
>> bash-4.2$ mpirun -np 80 /ccs/home/adaa/gromacs/bin/gmx_mpi mdrun -v -deffnm
>> water_em
>>
>> :-) GROMACS - gmx mdrun, VERSION 5.1.4 (-:
>> GROMACS is written by:
>> Emile Apol Rossen Apostolov Herman J.C. Berendsen Par Bjelkmar
>> Aldert van Buuren Rudi van Drunen Anton Feenstra Sebastian Fritsch
>> Gerrit Groenhof Christoph Junghans Anca Hamuraru Vincent Hindriksen
>> Dimitrios Karkoulis Peter Kasson Jiri Kraus Carsten Kutzner
>> Per Larsson Justin A. Lemkul Magnus Lundborg Pieter Meulenhoff
>> Erik Marklund Teemu Murtola Szilard Pall Sander Pronk
>> Roland Schulz Alexey Shvetsov Michael Shirts Alfons Sijbers
>> Peter Tieleman Teemu Virolainen Christian Wennberg Maarten Wolf
>> and the project leaders:
>> Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
>> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>> Copyright (c) 2001-2015, The GROMACS development team at
>> Uppsala University, Stockholm University and
>> the Royal Institute of Technology, Sweden.
>> check out http://www.gromacs.org for more information.
>> GROMACS is free software; you can redistribute it and/or modify it
>> under the terms of the GNU Lesser General Public License
>> as published by the Free Software Foundation; either version 2.1
>> of the License, or (at your option) any later version.
>> GROMACS: gmx mdrun, VERSION 5.1.4
>> Executable: /ccs/home/adaa/gromacs/bin/gmx_mpi
>> Data prefix: /ccs/home/adaa/gromacs
>> Command line:
>> gmx_mpi mdrun -v -deffnm water_em
>>
>>
>> Back Off! I just backed up water_em.log to ./#water_em.log.15#
>> Number of logical cores detected (160) does not match the number reported by
>> OpenMP (80).
>> Consider setting the launch configuration manually!
>> Running on 4 nodes with total 640 logical cores, 16 compatible GPUs
>> Logical cores per node: 160
>> Compatible GPUs per node: 4
>> All nodes have identical type(s) of GPUs
>> Hardware detected on host summitdev-r0c1n04 (the node of MPI rank 0):
>> CPU info:
>> Vendor: IBM
>> Brand: POWER8NVL (raw), altivec supported
>> SIMD instructions most likely to fit this hardware: IBM_VSX
>> SIMD instructions selected at GROMACS compile time: IBM_VMX
>> GPU info:
>> Number of GPUs detected: 4
>> #0: NVIDIA Tesla P100-SXM2-16GB, compute cap.: 6.0, ECC: yes, stat:
>> compatible
>> #1: NVIDIA Tesla P100-SXM2-16GB, compute cap.: 6.0, ECC: yes, stat:
>> compatible
>> #2: NVIDIA Tesla P100-SXM2-16GB, compute cap.: 6.0, ECC: yes, stat:
>> compatible
>> #3: NVIDIA Tesla P100-SXM2-16GB, compute cap.: 6.0, ECC: yes, stat:
>> compatible
>> Compiled SIMD instructions: IBM_VMX, GROMACS could use IBM_VSX on this
>> machine, which is better
>> Reading file water_em.tpr, VERSION 5.1.4 (single precision)
>> Using 80 MPI processes
>> Using 8 OpenMP threads per MPI process
>> On host summitdev-r0c1n04 4 compatible GPUs are present, with IDs 0,1,2,3
>> On host summitdev-r0c1n04 4 GPUs auto-selected for this run.
>> Mapping of GPU IDs to the 20 PP ranks in this node:
>> 0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3
>>
>> NOTE: GROMACS was configured without NVML support hence it can not exploit
>> application clocks of the detected Tesla P100-SXM2-16GB GPU to improve
>> performance.
>> Recompile with the NVML library (compatible with the driver used) or
>> set application clocks manually.
>>
>> -------------------------------------------------------
>> Program gmx mdrun, VERSION 5.1.4
>> Source code file:
>> /ccs/home/adaa/gromacs-5.1.4/src/gromacs/gmxlib/cuda_tools/pmalloc_cuda.cu,
>> line: 70
>> Fatal error:
>> cudaMallocHost of size 1024128 bytes failed: all CUDA-capable devices are
>> busy or unavailable
>>
>>
>>
>> ****************************************************************************************************
>>
>> Thanks for any help you can offer.
>>
>>
>> Best,
>>
>>
>> Ada
>>
>>
>>
>>
>> Ada Sedova
>>
>> Postdoctoral Research Associate
>>
>> Scientific Computing Group, NCCS
>>
>> Oak Ridge National Laboratory, Oak Ridge, TN
>>
>>