[gmx-users] slurm, gres:gpu, only 1 GPU out of 4 is detected

Mark Abraham mark.j.abraham at gmail.com
Thu Nov 14 19:32:19 CET 2019


Hi,

gmx can only detect devices that are visible to it. Your use of Slurm is
making only one device visible to each job, and gmx numbers that device 0,
so -gpu_id 1 refers to a device that gmx cannot see. But you don't need to
manage the same thing twice. If gmx can only see one device and --gres won't
allocate a GPU that is already allocated, then you have no need to use
-gpu_id.
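
As a rough sketch of a job script along those lines: the mdrun options are
the ones from your output with -gpu_id dropped, while the #SBATCH values and
the mdrun_args variable name are assumptions for illustration, not taken
from your setup.

```shell
#!/bin/sh
#SBATCH --gres=gpu:1          # one GPU per job, as in the original report
#SBATCH --cpus-per-task=8     # assumption: chosen to match -nt 8 below

# Slurm has already restricted the job to its allocated GPU, and gmx numbers
# the devices it can see starting from 0, so no -gpu_id is passed.
mdrun_args="mdrun -v -pin on -deffnm equi_nvt -nt 8 -nb gpu -pme gpu -npme 1 -ntmpi 4"

# Record what the job was given; useful when comparing with a login shell.
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"
echo "Command: gmx $mdrun_args"

if command -v gmx >/dev/null 2>&1; then
    # Unquoted on purpose so the string splits into separate arguments.
    gmx $mdrun_args
else
    echo "gmx not found in PATH; skipping the run" >&2
fi
```

Submitting this twice should give each job its own GPU, and each job would
address that GPU as the only device it can see.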

Mark

On Wed, 13 Nov 2019 at 19:21, Tamas Hegedus <tamas at hegelab.org> wrote:

> I had the misconception that I had to set -gpu_id from the
> CUDA_VISIBLE_DEVICES value set by slurm.
> However, slurm exposes the GPU to gromacs by a different mechanism.
>
> On 11/13/19 4:55 PM, Tamas Hegedus wrote:
> > Hi,
> >
> > I run gmx 2019 on GPUs.
> > There are 4 GPUs in my GPU hosts.
> > I have slurm configured with gres=gpu.
> >
> > 1. If I submit a job with --gres=gpu:1 then GPU #0 is identified and
> > used (-gpu_id $CUDA_VISIBLE_DEVICES).
> > 2. If I submit a second job, it fails: $CUDA_VISIBLE_DEVICES is 1 and
> > is passed to -gpu_id, but gmx identifies only GPU #0 as a compatible GPU.
> > From the output:
> >
> > gmx mdrun -v -pin on -deffnm equi_nvt -nt 8 -gpu_id 1 -nb gpu -pme gpu -npme 1 -ntmpi 4
> >
> >   GPU info:
> >     Number of GPUs detected: 1
> >     #0: NVIDIA GeForce GTX 1080 Ti, compute cap.: 6.1, ECC:  no, stat: compatible
> >
> > Fatal error:
> > You limited the set of compatible GPUs to a set that included ID #1, but that
> > ID is not for a compatible GPU. List only compatible GPUs.
> >
> > 3. If I log in to that node and run the mdrun command written into the
> > output in the previous step, then it selects the right GPU and runs as
> > expected.
> >
> > $CUDA_DEVICE_ORDER is set to PCI_BUS_ID
> >
> > I cannot decide whether this is a slurm config error or something with
> > gromacs, as $CUDA_VISIBLE_DEVICES is set correctly by slurm and I
> > expect gromacs to detect all 4 GPUs.
> >
> > Thanks for your help and suggestions,
> > Tamas
> >
>
> --
> Tamas Hegedus, PhD
> Senior Research Fellow
> Department of Biophysics and Radiation Biology
> Semmelweis University     | phone: (36) 1-459 1500/60233
> Tuzolto utca 37-47        | mailto:tamas at hegelab.org
> Budapest, 1094, Hungary   | http://www.hegelab.org
>