[gmx-users] Errors on running simulation in Cluster
dammar badu
dnbadu069 at gmail.com
Mon Jun 27 21:43:19 CEST 2016
Dear GROMACS users,
I have been running a coarse-grained simulation of a protein in a membrane,
with 13372 atoms in total and around 14350 water molecules. I have run the
simulation for 100 ns with dt = 0.02 and nsteps = 5000000, and it runs fine on
a single computer with 16 cores and a single GPU. However, when I run it on a
cluster with multiple GPUs, keeping everything else the same, it stops about
20 seconds after starting with the error shown in the log below.
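(For reference, on the single machine I start mdrun directly. This is only a
minimal sketch, assuming a thread-MPI build using all 16 cores and the one
GPU; the exact command is not recorded here:

gmx mdrun -s 1600ns_md.tpr -ntmpi 1 -ntomp 16 -gpu_id 0 -v

On the cluster the run goes through MPI instead; its log and error follow.)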
GROMACS: gmx mdrun, VERSION 5.1.2
Executable: /sharedapps/ICC/15.2/gromacs/5.1.2_gpu/bin/gmx_mpi
Data prefix: /sharedapps/ICC/15.2/gromacs/5.1.2_gpu
Command line:
gmx_mpi mdrun 1600ns_md -dlb auto -v -s 1600ns_md.tpr -npme 1
Back Off! I just backed up md.log to ./#md.log.10#
40 CPUs configured, but only 20 of them are online.
This can happen on embedded platforms (e.g. ARM) where the OS shuts some cores
off to save power, and will turn them back on later when the load increases.
However, this will likely mean GROMACS cannot pin threads to those cores. You
will likely see much better performance by forcing all cores to be online, and
making sure they run at their full clock frequency.
Number of logical cores detected (40) does not match the number reported by
OpenMP (10).
Consider setting the launch configuration manually!
Running on 1 node with total 20 cores, 40 logical cores, 4 compatible GPUs
Hardware detected on host compute-gpu-01 (the node of MPI rank 0):
CPU info:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
SIMD instructions most likely to fit this hardware: AVX2_256
SIMD instructions selected at GROMACS compile time: AVX2_256
GPU info:
Number of GPUs detected: 4
#0: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible
#1: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible
#2: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible
#3: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible
Reading file 1600ns_md.tpr, VERSION 5.1.1 (single precision)
Changing nstlist from 10 to 25, rlist from 1.308 to 1.408
The number of OpenMP threads was set by environment variable
OMP_NUM_THREADS to 4
Using 4 MPI processes
Using 4 OpenMP threads per MPI process
On host compute-gpu-01 4 compatible GPUs are present, with IDs 0,1,2,3
On host compute-gpu-01 3 GPUs auto-selected for this run.
Mapping of GPU IDs to the 3 PP ranks in this node: 0,1,2
NOTE: potentially sub-optimal launch configuration, gmx_mpi started with less
PP MPI processes per node than GPUs available.
Each PP MPI process can use only one GPU, 3 GPUs per node will be used.
NOTE: GROMACS was configured without NVML support hence it can not exploit
application clocks of the detected Tesla K80 GPU to improve
performance.
Recompile with the NVML library (compatible with the driver used) or
set application clocks manually.
Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity
Back Off! I just backed up traj_comp.xtc to ./#traj_comp.xtc.10#
Back Off! I just backed up ener.edr to ./#ener.edr.10#
NOTE: DLB will not turn on during the first phase of PME tuning
starting mdrun 'Martini system from folded_ligand_75copies.pdb'
5000000 steps, 100000.0 ps.
step 0
[compute-gpu-01:56224] *** Process received signal ***
[compute-gpu-01:56224] Signal: Segmentation fault (11)
[compute-gpu-01:56224] Signal code: Address not mapped (1)
[compute-gpu-01:56224] Failing at address: 0xfcb649f4
--------------------------------------------------------------------------
mpirun noticed that process rank 3 with PID 56224 on node compute-gpu-01
exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
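For completeness, the cluster launch, reconstructed from the log above, looks
roughly like the following (the explicit mpirun line and the exported
OMP_NUM_THREADS are my reconstruction; the log itself only reports 4 MPI
processes, OMP_NUM_THREADS=4, and the mdrun arguments):

export OMP_NUM_THREADS=4
mpirun -np 4 gmx_mpi mdrun 1600ns_md -dlb auto -v -s 1600ns_md.tpr -npme 1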
I am using the following input parameters for the simulation:
define = -DPOSRES_2 -DPOSRES_lipid
dt = 0.02
nsteps = 5000000
nstxout = 0
nstvout = 0
nstlog = 10000
nstxtcout = 5000
xtc-precision = 10
rlist = 1.4
cutoff-scheme = verlet
verlet-buffer-drift = 0.005
ns-type = grid;
nstlist = 10;
coulombtype = PME
coulomb-modifier = Potential_shift
rcoulomb = 1.3
fourierspacing = 0.1625
pme_order = 4
epsilon_r = 15
vdw-type = cutoff
vdw-modifier = Potential-shift
epsilon_rf = 0
;rvdw-switch = 0.9
rvdw = 1.3
tcoupl = v-rescale
tc-grps = Protein DPPC_DOPC_POPE_CHOL_PAMS_POPS W_ION
tau-t = 1.0 1.0 1.0
ref-t = 323 323 323
Pcoupl = parrinello-rahman
Pcoupltype = isotropic; semiisotropic ;
tau-p = 12.0 12.0
compressibility = 3e-4 3e-4
ref-p = 1.0 1.0
refcoord_scaling = all
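The .tpr was built with grompp under GROMACS 5.1.1 (according to the log
above). A sketch of such an invocation, with the file names as placeholders
and the index file assumed because the tc-grps use a merged lipid group:

gmx grompp -f martini_md.mdp -c system.gro -p topol.top -n index.ndx -o 1600ns_md.tpr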
I am wondering why it is difficult to run the same thing on the cluster. Can
anyone help me understand these errors?
Many Thanks
Dammar