[gmx-users] Can only find GPU'S on first node

Smith, Micholas D. smithmd at ornl.gov
Fri Feb 13 14:35:09 CET 2015

Have you tried using:

 mpirun -np (number of mpi-processes total)  -npernode (number of gpus per node) .

when you execute mdrun? I'm not sure it will do the trick, but it looks like it may work. That command should limit the number of MPI processes on each node to the number of GPUs on that node, so that every node uses all of its GPUs, one per MPI process.
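As a rough sketch of that suggestion, assuming Open MPI's -npernode option and the 2 GPUs and 20 cores per node described in the quoted message, the launch line could be assembled from the PBS node file as shown here. build_launch_cmd is a hypothetical helper name, and -ntomp is used only to split each node's cores evenly among its ranks. The function prints the command rather than running it, so it can be inspected first.

```shell
# build_launch_cmd is a hypothetical helper, not part of GROMACS or Open MPI.
# It prints an mpirun command line that starts one MPI rank per GPU per node.
build_launch_cmd() {
    nodefile=$1         # path to the node file, e.g. "$PBS_NODEFILE"
    gpus_per_node=$2    # assumed 2, as in the quoted message
    cores_per_node=$3   # assumed 20, as in the quoted message
    deffnm=$4           # value passed to mdrun -deffnm

    # PBS lists a node once per allocated core, so count unique host names.
    nodes=$(sort -u "$nodefile" | wc -l)
    nodes=$((nodes))    # normalise any padding that wc adds

    ranks=$((nodes * gpus_per_node))             # total MPI ranks for -np
    threads=$((cores_per_node / gpus_per_node))  # OpenMP threads per rank

    echo "mpirun -np $ranks -npernode $gpus_per_node -x LD_LIBRARY_PATH" \
         "-hostfile $nodefile mdrun -ntomp $threads -deffnm $deffnm"
}

# Intended use inside the PBS job script (not run here):
#   build_launch_cmd "$PBS_NODEFILE" 2 20 "eqlA$i"
```

For two nodes this should give -np 4 with -npernode 2 and 10 OpenMP threads per rank, which matches the "4 MPI processes" and "10 OpenMP threads" in the log but spreads the ranks over both nodes.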


From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Barnett, James W <jbarnet4 at tulane.edu>
Sent: Thursday, February 12, 2015 2:32 PM
To: gromacs.org_gmx-users at maillist.sys.kth.se
Subject: [gmx-users] Can only find GPU'S on first node

I'm having difficulty using GPUs across multiple nodes. I'm using OpenMPI to run GROMACS (5.0.4), and each node has 20 CPU cores and 2 GPUs. When I try to run GROMACS across multiple nodes (2 in this case), it only detects the CPU cores and GPUs from the first node.

Running GROMACS on 1 node with OpenMPI and using the 2 GPUs works fine. Running GROMACS on multiple nodes with OpenMPI and setting -nb to cpu also works fine, as GROMACS uses all CPU cores in that case. The problem appears only when running with GPUs across multiple nodes.

At the top of my PBS log file I see that two nodes are allocated for it:

PBS has allocated the following nodes:


A total of 40 processors on 2 nodes allocated

However, GROMACS gives the following warning and error, indicating that it has found only 20 CPU cores and 2 GPUs:

Using 4 MPI processes
Using 10 OpenMP threads per MPI process

WARNING: Oversubscribing the available 20 logical CPU cores with 40 threads.
         This will cause considerable performance loss!

2 GPUs detected on host qb140:
  #0: NVIDIA Tesla K20Xm, compute cap.: 3.5, ECC: yes, stat: compatible
  #1: NVIDIA Tesla K20Xm, compute cap.: 3.5, ECC: yes, stat: compatible

2 GPUs user-selected for this run.
Mapping of GPUs to the 4 PP ranks in this node: #0, #1

Program mdrun, VERSION 5.0.4
Source code file: /home/wes/gromacs-5.0.4/src/gromacs/gmxlib/gmx_detect_hardware.c, line: 359

Fatal error:
Incorrect launch configuration: mismatching number of PP MPI processes and GPUs per node.
mdrun was started with 4 PP MPI processes per node, but you provided 2 GPUs.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

Here is the run command I have been using, where I try to request 2 PP ranks per node (on 2 nodes in this case):

mdrun_command=$(which mdrun)
mpirun_command=$(which mpirun)

$mpirun_command -np 4 -x LD_LIBRARY_PATH -v -hostfile $PBS_NODEFILE $mdrun_command -deffnm eqlA$i
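One way to check where the ranks actually land, sketched here with a hypothetical helper, is to launch hostname with the same mpirun options and tally the output per host. tally_hosts is an illustrative name, not something provided by Open MPI or GROMACS.

```shell
# tally_hosts is a hypothetical helper: it reads one host name per line on
# stdin and prints "host count" pairs, one host per line, sorted by host.
tally_hosts() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Intended use inside the PBS job script (not run here):
#   mpirun -np 4 -hostfile "$PBS_NODEFILE" hostname | tally_hosts
```

If all four ranks report the same host, the launcher is probably filling the first node's slots before moving on to the second, which would be consistent with the "4 PP ranks in this node" line in the log above.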

I've tried to follow this article on running with GPUs: http://www.gromacs.org/Documentation/Acceleration_and_parallelization#Heterogenous_parallelization.3a_using_GPUs

Again, running with one node works fine:

$mpirun_command -np 2 -x LD_LIBRARY_PATH -v -hostfile $PBS_NODEFILE $mdrun_command -deffnm eqlA$

Running across multiple nodes while specifying not to use GPUs also works fine:

$mpirun_command -np 40 -x LD_LIBRARY_PATH -v -hostfile $PBS_NODEFILE $mdrun_command -nb cpu -deffnm eqlA$

Thanks again for any advice or direction you can give on this.

James "Wes" Barnett

Ph.D. Candidate

Chemical and Biomolecular Engineering

Tulane University

Boggs Center for Energy and Biotechnology, Room 341-B