[gmx-users] MPI GPU job failed
Justin Lemkul
jalemkul at vt.edu
Thu Aug 11 15:39:39 CEST 2016
On 8/11/16 9:37 AM, Albert wrote:
> Here is what I got for the command:
> mpirun -np 2 gmx_mpi mdrun -v -s 62.tpr -gpu_id 0
> It seems that it still used 1 GPU instead of 2. I don't understand why.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Running on 1 node with total 10 cores, 20 logical cores, 2 compatible GPUs
Then this is inconsistent with my first question in the last reply. You have
two GPUs on a single physical node. For this, you should not need an external
MPI library; the built-in thread-MPI should be enough:

gmx mdrun -ntmpi 2 -v -s 62.tpr -gpu_id 01
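
As a rough sketch of the reasoning behind that command (one thread-MPI rank per
GPU, with the logical cores split evenly between ranks), the options could be
derived as below. The variable names and the even split of cores are my own
assumptions for illustration, not anything mdrun requires; the values come from
the hardware report in your log (2 GPUs, 20 logical cores). The script only
prints the command, it does not run it.

```shell
#!/bin/sh
# Sketch only: build an mdrun command line with one PP rank per GPU.
# NGPU, NCORES and TPR are illustrative variables (assumptions), filled in
# from the hardware summary mdrun printed for this node.
NGPU=2        # "2 compatible GPUs"
NCORES=20     # "20 logical cores"
TPR=62.tpr

# Assumption: split the logical cores evenly across the ranks.
NTOMP=$((NCORES / NGPU))

# -gpu_id takes one digit per PP rank on the node, so two ranks on
# GPUs 0 and 1 give the string "01". This loop only works for IDs 0-9.
GPU_ID=""
i=0
while [ "$i" -lt "$NGPU" ]; do
    GPU_ID="${GPU_ID}${i}"
    i=$((i + 1))
done

CMD="gmx mdrun -ntmpi $NGPU -ntomp $NTOMP -v -s $TPR -gpu_id $GPU_ID"
echo "$CMD"
```

The point is that the number of digits given to -gpu_id has to match the number
of PP ranks on the node, which is exactly what the fatal error in your second
attempt is complaining about.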
> Hardware detected on host cudaB (the node of MPI rank 0):
> CPU info:
> Vendor: GenuineIntel
> Brand: Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
> SIMD instructions most likely to fit this hardware: AVX_256
> SIMD instructions selected at GROMACS compile time: AVX_256
> GPU info:
> Number of GPUs detected: 2
> #0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
> #1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
> Reading file 62.tpr, VERSION 5.1.3 (single precision)
> Reading file 62.tpr, VERSION 5.1.3 (single precision)
> Using 1 MPI process
> Using 20 OpenMP threads
> 1 GPU user-selected for this run.
> Mapping of GPU ID to the 1 PP rank in this node: 0
> Using 1 MPI process
> Using 20 OpenMP threads
> 1 GPU user-selected for this run.
> Mapping of GPU ID to the 1 PP rank in this node: 0
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Here is what I got for the command:
> mpirun -np 2 gmx_mpi mdrun -ntomp 10 -v -s 62.tpr -gpu_id 01
> It still failed.
> -------------------------------------------------------
> Program gmx mdrun, VERSION 5.1.3
> Source code file:
> /home/albert/Downloads/gromacs/gromacs-5.1.3/src/gromacs/gmxlib/gmx_detect_hardware.cpp,
> line: 458
> Fatal error:
> Incorrect launch configuration: mismatching number of PP MPI processes and GPUs
> per node.
> gmx_mpi was started with 1 PP MPI process per node, but you provided 2 GPUs.
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
> Halting program gmx mdrun
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 1.
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> Using 1 MPI process
> Using 10 OpenMP threads
> 2 GPUs user-selected for this run.
> Mapping of GPU IDs to the 1 PP rank in this node: 0,1
> -------------------------------------------------------
> On 08/11/2016 03:33 PM, Justin Lemkul wrote:
>> So you're trying to run on two nodes, each of which has one GPU? I haven't
>> done such a run, but perhaps mpirun -np 2 gmx_mpi mdrun -v -s 62.tpr -gpu_id 0
>> would do the trick, by finding the first GPU on each node?
>> -Justin
Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow
Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 629
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201
jalemkul at outerbanks.umaryland.edu | (410) 706-7441