[gmx-users] MPI GPU job failed
Albert
mailmd2011 at gmail.com
Thu Aug 11 15:37:27 CEST 2016
Here is what I got for the command:
mpirun -np 2 gmx_mpi mdrun -v -s 62.tpr -gpu_id 0
It seems that it still used only 1 GPU instead of 2, and I don't understand why.
--------------------------------------------------------------------------
Running on 1 node with total 10 cores, 20 logical cores, 2 compatible GPUs
Hardware detected on host cudaB (the node of MPI rank 0):
CPU info:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: AVX_256
GPU info:
Number of GPUs detected: 2
#0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
#1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
Reading file 62.tpr, VERSION 5.1.3 (single precision)
Reading file 62.tpr, VERSION 5.1.3 (single precision)
Using 1 MPI process
Using 20 OpenMP threads
1 GPU user-selected for this run.
Mapping of GPU ID to the 1 PP rank in this node: 0
Using 1 MPI process
Using 20 OpenMP threads
1 GPU user-selected for this run.
Mapping of GPU ID to the 1 PP rank in this node: 0
--------------------------------------------------------------------------
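For reference, the equivalent single-node run with a thread-MPI (non-MPI) build would normally be launched as below; this assumes a plain gmx binary is also installed alongside gmx_mpi, which I have not verified:

gmx mdrun -ntmpi 2 -ntomp 10 -v -s 62.tpr -gpu_id 01

Here -ntmpi 2 starts two PP ranks and -gpu_id 01 maps rank 0 to GPU 0 and rank 1 to GPU 1.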
Here is what I got for the command:
mpirun -np 2 gmx_mpi mdrun -ntomp 10 -v -s 62.tpr -gpu_id 01
It still failed.
-------------------------------------------------------
Program gmx mdrun, VERSION 5.1.3
Source code file:
/home/albert/Downloads/gromacs/gromacs-5.1.3/src/gromacs/gmxlib/gmx_detect_hardware.cpp,
line: 458
Fatal error:
Incorrect launch configuration: mismatching number of PP MPI processes
and GPUs per node.
gmx_mpi was started with 1 PP MPI process per node, but you provided 2 GPUs.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
Halting program gmx mdrun
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
Using 1 MPI process
Using 10 OpenMP threads
2 GPUs user-selected for this run.
Mapping of GPU IDs to the 1 PP rank in this node: 0,1
-------------------------------------------------------
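If I read the error correctly, the number of PP ranks per node has to match the number of IDs passed to -gpu_id. As an illustration only (I have not tried this), four ranks sharing the two GPUs of this node would be requested with:

mpirun -np 4 gmx_mpi mdrun -ntomp 5 -v -s 62.tpr -gpu_id 0011

where repeating an ID assigns two PP ranks to the same GPU.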
On 08/11/2016 03:33 PM, Justin Lemkul wrote:
> So you're trying to run on two nodes, each of which has one GPU? I
> haven't done such a run, but perhaps mpirun -np 2 gmx_mpi mdrun -v -s
> 62.tpr -gpu_id 0 would do the trick, by finding the first GPU on each
> node?
>
> -Justin
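If I understand the two-node suggestion correctly, the launch would be something like the following; hosts.txt and the node name cudaA are only placeholders, cudaB is the node from the log above:

hosts.txt:
cudaA slots=1
cudaB slots=1

mpirun -np 2 --hostfile hosts.txt gmx_mpi mdrun -v -s 62.tpr -gpu_id 0

Since -gpu_id is interpreted per node, each rank should then pick up GPU 0 on its own host.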