[gmx-users] GPU job failed
Carsten Kutzner
ckutzne at gwdg.de
Tue Sep 9 09:16:16 CEST 2014
Hi,
from the double output it looks like two identical mdruns, each with
1 PP process and 10 OpenMP threads, were started. Maybe there is
something wrong with your MPI setup (did you compile with thread-MPI
instead of real MPI by mistake?).
Carsten
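
For reference, one quick check: the header that mdrun prints at startup
(it is also at the top of the .log file) reports which MPI library the
binary was built against. A sketch, assuming a GROMACS 5.0-style build
(the exact wording can differ between versions):

  $ mdrun_mpi -version
  ...
  MPI library:        MPI     (a thread-MPI build shows "thread_mpi" here)

If it says thread_mpi, mpirun -np 2 just starts two independent copies,
which would match the doubled output below. In that case, rebuild with
real MPI enabled, e.g.:

  cmake .. -DGMX_MPI=ON
  make && make install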
On 09 Sep 2014, at 09:06, Albert <mailmd2011 at gmail.com> wrote:
> Here is more information from the log file:
>
> mpirun -np 2 mdrun_mpi -v -s npt2.tpr -c npt2.gro -x npt2.xtc -g npt2.log -gpu_id 01 -ntomp 0
>
>
> Number of hardware threads detected (20) does not match the number
> reported by OpenMP (10).
> Consider setting the launch configuration manually!
>
> Number of hardware threads detected (20) does not match the number
> reported by OpenMP (10).
> Consider setting the launch configuration manually!
> Reading file npt2.tpr, VERSION 5.0.1 (single precision)
> Reading file npt2.tpr, VERSION 5.0.1 (single precision)
> Using 1 MPI process
> Using 10 OpenMP threads
>
> 2 GPUs detected on host cudaB:
> #0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
> #1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
>
> 2 GPUs user-selected for this run.
> Mapping of GPUs to the 1 PP rank in this node: #0, #1
>
>
> -------------------------------------------------------
> Program mdrun_mpi, VERSION 5.0.1
> Source code file: /soft2/plumed-2.2/gromacs-5.0.1/src/gromacs/gmxlib/gmx_detect_hardware.c, line: 359
>
> Fatal error:
> Incorrect launch configuration: mismatching number of PP MPI processes
> and GPUs per node.
> mdrun_mpi was started with 1 PP MPI process per node, but you provided 2
> GPUs.
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
> Halting program mdrun_mpi
>
> gcq#314: "Do You Have Sex Maniacs or Schizophrenics or Astrophysicists
> in Your Family?" (Gogol Bordello)
>
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode -1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> Using 1 MPI process
> Using 10 OpenMP threads
>
> 2 GPUs detected on host cudaB:
> #0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
> #1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
>
> 2 GPUs user-selected for this run.
> Mapping of GPUs to the 1 PP rank in this node: #0, #1
>
>
> -------------------------------------------------------
> Program mdrun_mpi, VERSION 5.0.1
> Source code file: /soft2/plumed-2.2/gromacs-5.0.1/src/gromacs/gmxlib/gmx_detect_hardware.c, line: 359
>
> Fatal error:
> Incorrect launch configuration: mismatching number of PP MPI processes
> and GPUs per node.
> mdrun_mpi was started with 1 PP MPI process per node, but you provided 2
> GPUs.
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
> Halting program mdrun_mpi
>
> gcq#56: "Lunatics On Pogo Sticks" (Red Hot Chili Peppers)
>
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode -1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
>
> On 09/08/2014 11:59 PM, Yunlong Liu wrote:
>> Same idea as Szilard.
>>
>> How many nodes are you using?
>> On one node, how many MPI ranks do you have? The error is complaining
>> that you assigned two GPUs to a single MPI process on one node. If you
>> spread your two MPI ranks over two nodes, you have only one rank on
>> each node, and you cannot assign two GPUs to a single MPI rank.
>>
>> How many GPUs do you have on one node? If there are two, you can launch
>> two PP MPI processes on that node and assign the two GPUs to them. If
>> you want to launch only one MPI rank per node, assign just one GPU to
>> each node (with -gpu_id 0).
>>
>> Yunlong
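
For concreteness, a rough sketch of the two launch lines Yunlong
describes, assuming a real-MPI mdrun_mpi build and one node with two
GPUs and 20 hardware threads (as in the log above):

  # two PP ranks on the node, mapped to GPUs 0 and 1, 10 OpenMP threads each
  mpirun -np 2 mdrun_mpi -v -s npt2.tpr -gpu_id 01 -ntomp 10

  # or: a single rank on the node, using only the first GPU
  mpirun -np 1 mdrun_mpi -v -s npt2.tpr -gpu_id 0

With -gpu_id 01, rank 0 gets GPU #0 and rank 1 gets GPU #1; -ntomp 10
splits the 20 hardware threads evenly between the two ranks.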
>
--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa