[gmx-users] Simulations on GPU
Igor Leontyev
ileontyev at ucdavis.edu
Tue Oct 12 03:00:28 CEST 2010
I am now able to run simulations on the GPU, but the output is weird. For
example, the temperature drops down to 270 K while ref_t=298 (Tcoupl=andersen).
Moreover, after several hours of simulation mdrun-gpu starts to output
"NAN" energies and hangs. The pre-run and post-run GPU memory tests always
pass. The graphics card is the one provided with HP desktops (might be MSI):
an NVIDIA GTX260 with 1.8 GB of memory. The output of the mdrun and mdrun-gpu
versions of Gromacs is given below. Any ideas? Thanks.
Igor
////////////////////////////////////////////////////////////////////////////////////////////////////
Log file opened on Fri Oct 8 14:46:51 2010
Host: powerpc pid: 32083 nodeid: 0 nnodes: 4
The Gromacs distribution was built Thu Sep 30 14:42:48 PDT 2010 by
leontyev at powerpc (Linux 2.6.32-22-generic x86_64)
:-) G R O M A C S (-:
Gromacs Runs One Microsecond At Cannonball Speeds
:-) VERSION 4.5.1 (-:
Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
Gerrit Groenhof, Peter Kasson, Per Larsson, Peiter Meulenhoff,
Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schultz,
Michael Shirts, Alfons Sijbers, Peter Tieleman,
Berk Hess, David van der Spoel, and Erik Lindahl.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2010, The GROMACS development team at
Uppsala University & The Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) /usr/local/opt/bin/gromacs/gromacs-4.5.1/bin/mdrun_mpich2 (-:
Input Parameters:
integrator = md
nsteps = 10000
init_step = 0
ns_type = Grid
nstlist = 10
ndelta = 2
nstcomm = 1003
comm_mode = Linear
nstlog = 1000
nstxout = 5000
nstvout = 10000000
nstfout = 0
nstcalcenergy = 10
nstenergy = 1000
nstxtcout = 0
init_t = 0
delta_t = 0.001
xtcprec = 1000
nkx = 54
nky = 60
nkz = 90
pme_order = 6
ewald_rtol = 1e-05
ewald_geometry = 0
epsilon_surface = 0
optimize_fft = TRUE
ePBC = xyz
bPeriodicMols = FALSE
bContinuation = FALSE
bShakeSOR = FALSE
etc = Andersen
nsttcouple = 10
epc = Berendsen
epctype = Isotropic
nstpcouple = 10
tau_p = 0.5
ref_p (3x3):
ref_p[ 0]={ 1.01325e+00, 0.00000e+00, 0.00000e+00}
ref_p[ 1]={ 0.00000e+00, 1.01325e+00, 0.00000e+00}
ref_p[ 2]={ 0.00000e+00, 0.00000e+00, 1.01325e+00}
compress (3x3):
compress[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
compress[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
compress[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
refcoord_scaling = No
posres_com (3):
posres_com[0]= 0.00000e+00
posres_com[1]= 0.00000e+00
posres_com[2]= 0.00000e+00
posres_comB (3):
posres_comB[0]= 0.00000e+00
posres_comB[1]= 0.00000e+00
posres_comB[2]= 0.00000e+00
andersen_seed = 815131
rlist = 1.2
rlistlong = 1.2
rtpi = 0.05
coulombtype = PME
rcoulomb_switch = 0
rcoulomb = 1.2
vdwtype = Cut-off
rvdw_switch = 0
rvdw = 1.2
epsilon_r = 1
epsilon_rf = 1
tabext = 1
implicit_solvent = No
gb_algorithm = Still
gb_epsilon_solvent = 80
nstgbradii = 1
rgbradii = 1
gb_saltconc = 0
gb_obc_alpha = 1
gb_obc_beta = 0.8
gb_obc_gamma = 4.85
gb_dielectric_offset = 0.009
sa_algorithm = No
sa_surface_tension = 2.092
DispCorr = EnerPres
free_energy = no
init_lambda = 0
delta_lambda = 0
n_foreign_lambda = 0
sc_alpha = 0
sc_power = 0
sc_sigma = 0.3
sc_sigma_min = 0.3
nstdhdl = 10
separate_dhdl_file = yes
dhdl_derivatives = yes
dh_hist_size = 0
dh_hist_spacing = 0.1
nwall = 0
wall_type = 9-3
wall_atomtype[0] = -1
wall_atomtype[1] = -1
wall_density[0] = 0
wall_density[1] = 0
wall_ewald_zfac = 3
pull = no
disre = No
disre_weighting = Conservative
disre_mixed = FALSE
dr_fc = 1000
dr_tau = 0
nstdisreout = 100
orires_fc = 0
orires_tau = 0
nstorireout = 100
dihre-fc = 1000
em_stepsize = 0.01
em_tol = 10
niter = 20
fc_stepsize = 0
nstcgsteep = 1000
nbfgscorr = 10
ConstAlg = Lincs
shake_tol = 0.0001
lincs_order = 8
lincs_warnangle = 30
lincs_iter = 4
bd_fric = 0
ld_seed = 1993
cos_accel = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
grpopts:
nrdf: 99021
ref_t: 298.15
tau_t: 0.3
anneal: No
ann_npoints: 0
acc: 0 0 0
nfreeze: Y Y Y N N N
energygrp_flags[ 0]: 0 0
energygrp_flags[ 1]: 0 0
efield-x:
n = 0
efield-xt:
n = 0
efield-y:
n = 0
efield-yt:
n = 0
efield-z:
n = 0
efield-zt:
n = 0
bQMMM = FALSE
QMconstraints = 0
QMMMscheme = 0
scalefactor = 1
qm_opts:
ngQM = 0
Initializing Domain Decomposition on 4 nodes
Dynamic load balancing: auto
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
two-body bonded interactions: 0.585 nm, LJ-14, atoms 10901 11433
multi-body bonded interactions: 0.482 nm, Ryckaert-Bell., atoms 11431 11935
Minimum cell size due to bonded interactions: 0.530 nm
Maximum distance for 9 constraints, at 120 deg. angles, all-trans: 0.218 nm
Estimated maximum distance required for P-LINCS: 0.218 nm
Using 0 separate PME nodes
Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
Optimizing the DD grid for 4 cells with a minimum initial size of 0.663 nm
The maximum allowed number of cells is: X 9 Y 10 Z 16
Domain decomposition grid 1 x 4 x 1, separate PME nodes 0
PME domain decomposition: 1 x 4 x 1
Domain decomposition nodeid 0, coordinates 0 0 0
Table routines are used for coulomb: TRUE
Table routines are used for vdw: FALSE
Will do PME sum in reciprocal space.
Will do ordinary reciprocal space Ewald sum.
Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
Cut-off's: NS: 1.2 Coulomb: 1.2 LJ: 1.2
Long Range LJ corr.: <C6> 4.0351e-04
System total charge: -0.000
Generated table with 1100 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Enabling SPC-like water optimization for 11505 molecules.
Configuring nonbonded kernels...
Configuring standard C nonbonded kernels...
Testing x86_64 SSE2 support... present.
Removing pbc first time
Initializing Parallel LINear Constraint Solver
Linking all bonded interactions to atoms
There are 65716 inter charge-group exclusions,
will use an extra communication step for exclusion forces for PME
The initial number of communication pulses is: Y 1
The initial domain decomposition cell size is: Y 1.77 nm
The maximum allowed distance for charge groups involved in interactions is:
non-bonded interactions 1.200 nm
(the following are initial values, they could change due to box deformation)
two-body bonded interactions (-rdd) 1.200 nm
multi-body bonded interactions (-rdd) 1.200 nm
atoms separated by up to 9 constraints (-rcon) 1.773 nm
When dynamic load balancing gets turned on, these settings will change to:
The maximum number of communication pulses is: Y 1
The minimum size for domain decomposition cells is 1.200 nm
The requested allowed shrink of DD cells (option -dds) is: 0.80
The allowed shrink of domain decomposition cells is: Y 0.68
The maximum allowed distance for charge groups involved in interactions is:
non-bonded interactions 1.200 nm
two-body bonded interactions (-rdd) 1.200 nm
multi-body bonded interactions (-rdd) 1.200 nm
atoms separated by up to 9 constraints (-rcon) 1.200 nm
Making 1D domain decomposition grid 1 x 4 x 1, home cell index 0 0 0
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest
There are: 46503 Atoms
Charge group distribution at step 0: 4533 7043 7334 4581
Grid: 10 x 6 x 17 cells
Constraining the starting coordinates (step 0)
Constraining the coordinates at t0-dt (step 0)
RMS relative constraint deviation after constraining: 7.96e-07
Initial temperature: 297.745 K
Started mdrun on node 0 Fri Oct 8 14:46:51 2010
Step Time Lambda
0 0.00000 0.00000
Energies (kJ/mol)
Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
9.26629e+03 2.53358e+04 1.36779e+03 2.97600e+04 1.20809e+04
Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip.
1.40505e+05 3.83498e+04 -2.30989e+03 -5.95333e+05 -1.96357e+05
Potential Kinetic En. Total Energy Temperature Pres. DC (bar)
-5.37334e+05 1.22595e+05 -4.14739e+05 2.97810e+02 -1.67546e+02
Pressure (bar) Constr. rmsd
2.67468e+00 1.03652e-06
DD step 9 load imb.: force 19.9%
At step 10 the performance loss due to force load imbalance is 9.3 %
NOTE: Turning on dynamic load balancing
DD step 999 vol min/aver 0.777 load imb.: force 0.1%
Step Time Lambda
1000 1.00000 0.00000
Energies (kJ/mol)
Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
9.29054e+03 2.49530e+04 1.43296e+03 2.96188e+04 1.19777e+04
Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip.
1.40496e+05 3.99112e+04 -2.30308e+03 -5.96482e+05 -1.96429e+05
Potential Kinetic En. Total Energy Temperature Pres. DC (bar)
-5.37533e+05 1.22974e+05 -4.14560e+05 2.98729e+02 -1.66560e+02
Pressure (bar) Constr. rmsd
-1.40877e+02 1.04647e-06
DD step 1999 vol min/aver 0.773 load imb.: force 0.1%
................................................................................
Step Time Lambda
10000 10.00000 0.00000
Writing checkpoint, step 10000 at Fri Oct 8 14:58:26 2010
Energies (kJ/mol)
Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
9.00658e+03 2.52059e+04 1.34920e+03 2.95995e+04 1.19606e+04
Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip.
1.40474e+05 4.00471e+04 -2.30290e+03 -5.96601e+05 -1.96374e+05
Potential Kinetic En. Total Energy Temperature Pres. DC (bar)
-5.37636e+05 1.22577e+05 -4.15059e+05 2.97765e+02 -1.66533e+02
Pressure (bar) Constr. rmsd
-5.69272e+01 1.04191e-06
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Statistics over 10001 steps using 1001 frames
Energies (kJ/mol)
Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
9.11274e+03 2.49545e+04 1.36688e+03 2.96269e+04 1.20386e+04
Coulomb-14 LJ (SR) Disper. corr. Coulomb (SR) Coul. recip.
1.40680e+05 3.95513e+04 -2.30457e+03 -5.95701e+05 -1.96403e+05
Potential Kinetic En. Total Energy Temperature Pres. DC (bar)
-5.37077e+05 1.22248e+05 -4.14829e+05 2.96967e+02 -1.66776e+02
Pressure (bar) Constr. rmsd
4.32167e+00 0.00000e+00
Box-X Box-Y Box-Z
6.01417e+00 7.09874e+00 1.07493e+01
Total Virial (kJ/mol)
4.10242e+04 -9.05005e+00 -2.30129e+02
-1.20791e+01 4.05914e+04 1.71615e+02
-2.13770e+02 1.99254e+02 4.04540e+04
Pressure (bar)
-1.22617e+01 -8.98547e-01 1.93020e+01
-6.78561e-01 2.15880e+01 -8.68094e+00
1.81194e+01 -1.06823e+01 3.63870e+00
Total Dipole (D)
4.73145e+02 -1.30311e+03 -2.15240e+02
Epot (kJ/mol) Coul-SR LJ-SR Coul-14 LJ-14
glu242side-glu242side 2.99268e+00 0.00000e+00 -1.85865e+02 1.35027e+00
glu242side-rest -5.15085e+01 -2.83484e+01 2.08195e+01 4.24614e+00
rest-rest -5.95653e+05 3.95797e+04 1.40846e+05 1.20330e+04
M E G A - F L O P S A C C O U N T I N G
RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
NF=No Forces
Computing: M-Number M-Flops % Flops
-----------------------------------------------------------------------------
Coul(T) 10781.556318 452825.365 4.6
Coul(T) [W3] 70.655819 8831.977 0.1
Coul(T) + LJ 34247.547832 1883615.131 18.9
Coul(T) + LJ [W3] 4684.616330 646477.054 6.5
Coul(T) + LJ [W3-W3] 12244.355656 4677343.861 47.0
Outer nonbonded loop 2334.588434 23345.884 0.2
1,4 nonbonded interactions 314.111408 28270.027 0.3
Calc Weights 1395.229509 50228.262 0.5
Spread Q Bspline 100456.524648 200913.049 2.0
Gather F Bspline 100456.524648 602739.148 6.1
3D-FFT 105882.547196 847060.378 8.5
Solve PME 1490.549040 95395.139 1.0
NS-Pairs 12131.757207 254766.901 2.6
Reset In Box 23.514491 70.543 0.0
CG-CoM 46.596006 139.788 0.0
Bonds 61.886188 3651.285 0.0
Angles 219.991997 36958.655 0.4
Propers 23.872387 5466.777 0.1
RB-Dihedrals 253.765374 62680.047 0.6
Virial 46.729683 841.134 0.0
Stop-CM 0.465030 4.650 0.0
P-Coupling 465.076503 2790.459 0.0
Calc-Ekin 465.123006 12558.321 0.1
Lincs 62.989618 3779.377 0.0
Lincs-Mat 569.895960 2279.584 0.0
Constraint-V 471.185790 3769.486 0.0
Constraint-Vir 40.852892 980.469 0.0
Settle 115.084515 37172.298 0.4
-----------------------------------------------------------------------------
Total 9944955.052 100.0
-----------------------------------------------------------------------------
D O M A I N D E C O M P O S I T I O N S T A T I S T I C S
av. #atoms communicated per step for force: 2 x 31556.3
av. #atoms communicated per step for LINCS: 5 x 512.1
Average load imbalance: 0.5 %
Part of the total run time spent waiting due to load imbalance: 0.3 %
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: Y 0 %
R E A L C Y C L E A N D T I M E A C C O U N T I N G
Computing: Nodes Number G-Cycles Seconds %
-----------------------------------------------------------------------
Domain decomp. 4 1001 56.954 21.4 0.8
DD comm. load 4 1000 0.206 0.1 0.0
DD comm. bounds 4 1000 2.836 1.1 0.0
Comm. coord. 4 10001 30.480 11.5 0.4
Neighbor search 4 1001 579.978 218.1 7.8
Force 4 10001 4548.315 1710.0 61.5
Wait + Comm. F 4 10001 17.520 6.6 0.2
PME mesh 4 10001 1897.783 713.5 25.7
Write traj. 4 3 0.668 0.3 0.0
Update 4 10001 45.142 17.0 0.6
Constraints 4 10001 181.826 68.4 2.5
Comm. energies 4 1011 3.026 1.1 0.0
Rest 4 31.895 12.0 0.4
-----------------------------------------------------------------------
Total 4 7396.630 2780.9 100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
PME redist. X/F 4 20002 208.454 78.4 2.8
PME spread/gather 4 20002 1440.827 541.7 19.5
PME 3D-FFT 4 20002 203.508 76.5 2.8
PME solve 4 10001 44.697 16.8 0.6
-----------------------------------------------------------------------
Parallel run - timing based on wallclock.
NODE (s) Real (s) (%)
Time: 695.218 695.218 100.0
11:35
(Mnbf/s) (GFlops) (ns/day) (hour/ns)
Performance: 243.800 14.305 1.243 19.310
Finished mdrun on node 0 Fri Oct 8 14:58:27 2010
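As a sanity check on the CPU log above, the reported temperature can be recomputed from the kinetic energy and the number of degrees of freedom (nrdf) using T = 2 E_kin / (N_df k_B). The following is only a sketch in Python: the function name and the value used for the Boltzmann constant are assumptions that do not come from the log, while the energies and nrdf are copied from it.

```python
# Boltzmann constant in kJ/(mol K); assumed value (molar gas constant / 1000),
# not taken from the log and possibly not identical to the one Gromacs uses.
K_B = 0.0083144626


def temperature_from_ekin(ekin_kj_mol, nrdf):
    """Return the temperature in K for a kinetic energy in kJ/mol."""
    return 2.0 * ekin_kj_mol / (nrdf * K_B)


NRDF = 99021  # "nrdf" from the grpopts section of the log

# step -> (kinetic energy in kJ/mol, temperature in K) as printed in the CPU log
cpu_frames = {
    0: (1.22595e+05, 2.97810e+02),
    10000: (1.22577e+05, 2.97765e+02),
}

recomputed = {
    step: temperature_from_ekin(ekin, NRDF)
    for step, (ekin, _) in cpu_frames.items()
}

for step, (_, logged_t) in cpu_frames.items():
    print(f"step {step}: logged {logged_t:.3f} K, recomputed {recomputed[step]:.3f} K")
```

If the recomputed and logged values agree to within rounding, the temperature printed in the log is simply the kinetic energy rescaled, so the same check can be applied to the kinetic energies printed by mdrun-gpu.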
////////////////////////////////////////////////////////////////////////////////////////////////////
:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 4.5.1-dev-20101006-d3b58 (-:
Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff,
Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schultz,
Michael Shirts, Alfons Sijbers, Peter Tieleman,
Berk Hess, David van der Spoel, and Erik Lindahl.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2010, The GROMACS development team at
Uppsala University & The Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) /home/leontyev/programs/bin/gromacs/gromacs-4.5.1-gpu/bin/mdrun-gpu (-:
Input Parameters:
integrator = md
nsteps = 10000
init_step = 0
ns_type = Grid
nstlist = 10
ndelta = 2
nstcomm = 1003
comm_mode = Linear
nstlog = 1000
nstxout = 5000
nstvout = 10000000
nstfout = 0
nstcalcenergy = 10
nstenergy = 1000
nstxtcout = 0
init_t = 0
delta_t = 0.001
xtcprec = 1000
nkx = 54
nky = 60
nkz = 90
pme_order = 6
ewald_rtol = 1e-05
ewald_geometry = 0
epsilon_surface = 0
optimize_fft = TRUE
ePBC = xyz
bPeriodicMols = FALSE
bContinuation = FALSE
bShakeSOR = FALSE
etc = Andersen
nsttcouple = 10
epc = Berendsen
epctype = Isotropic
nstpcouple = 10
tau_p = 0.5
ref_p (3x3):
ref_p[ 0]={ 1.01325e+00, 0.00000e+00, 0.00000e+00}
ref_p[ 1]={ 0.00000e+00, 1.01325e+00, 0.00000e+00}
ref_p[ 2]={ 0.00000e+00, 0.00000e+00, 1.01325e+00}
compress (3x3):
compress[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
compress[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
compress[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
refcoord_scaling = No
posres_com (3):
posres_com[0]= 0.00000e+00
posres_com[1]= 0.00000e+00
posres_com[2]= 0.00000e+00
posres_comB (3):
posres_comB[0]= 0.00000e+00
posres_comB[1]= 0.00000e+00
posres_comB[2]= 0.00000e+00
andersen_seed = 815131
rlist = 1.2
rlistlong = 1.2
rtpi = 0.05
coulombtype = PME
rcoulomb_switch = 0
rcoulomb = 1.2
vdwtype = Cut-off
rvdw_switch = 0
rvdw = 1.2
epsilon_r = 1
epsilon_rf = 1
tabext = 1
implicit_solvent = No
gb_algorithm = Still
gb_epsilon_solvent = 80
nstgbradii = 1
rgbradii = 1
gb_saltconc = 0
gb_obc_alpha = 1
gb_obc_beta = 0.8
gb_obc_gamma = 4.85
gb_dielectric_offset = 0.009
sa_algorithm = Ace-approximation
sa_surface_tension = 2.092
DispCorr = EnerPres
free_energy = no
init_lambda = 0
delta_lambda = 0
n_foreign_lambda = 0
sc_alpha = 0
sc_power = 0
sc_sigma = 0.3
sc_sigma_min = 0.3
nstdhdl = 10
separate_dhdl_file = yes
dhdl_derivatives = yes
dh_hist_size = 0
dh_hist_spacing = 0.1
nwall = 0
wall_type = 9-3
wall_atomtype[0] = -1
wall_atomtype[1] = -1
wall_density[0] = 0
wall_density[1] = 0
wall_ewald_zfac = 3
pull = no
disre = No
disre_weighting = Conservative
disre_mixed = FALSE
dr_fc = 1000
dr_tau = 0
nstdisreout = 100
orires_fc = 0
orires_tau = 0
nstorireout = 100
dihre-fc = 1000
em_stepsize = 0.01
em_tol = 10
niter = 20
fc_stepsize = 0
nstcgsteep = 1000
nbfgscorr = 10
ConstAlg = Lincs
shake_tol = 0.0001
lincs_order = 8
lincs_warnangle = 30
lincs_iter = 4
bd_fric = 0
ld_seed = 1993
cos_accel = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
grpopts:
nrdf: 99021
ref_t: 298.15
tau_t: 0.3
anneal: No
ann_npoints: 0
acc: 0 0 0
nfreeze: Y Y Y N N N
energygrp_flags[ 0]: 0 0
energygrp_flags[ 1]: 0 0
efield-x:
n = 0
efield-xt:
n = 0
efield-y:
n = 0
efield-yt:
n = 0
efield-z:
n = 0
efield-zt:
n = 0
bQMMM = FALSE
QMconstraints = 0
QMMMscheme = 0
scalefactor = 1
qm_opts:
ngQM = 0
Table routines are used for coulomb: TRUE
Table routines are used for vdw: FALSE
Will do PME sum in reciprocal space.
Will do ordinary reciprocal space Ewald sum.
Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
Cut-off's: NS: 1.2 Coulomb: 1.2 LJ: 1.2
Long Range LJ corr.: <C6> 4.0351e-04
System total charge: -0.000
Generated table with 1100 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Enabling SPC-like water optimization for 11505 molecules.
Configuring nonbonded kernels...
Configuring standard C nonbonded kernels...
Removing pbc first time
Initializing LINear Constraint Solver
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest
Max number of connections per atom is 91
Total number of connections is 387700
Max number of graph edges per atom is 6
Total number of graph edges is 70330
OpenMM plugins loaded from directory /home/leontyev/programs/bin/gromacs/OpenMM2.0-Linux64/lib/plugins: libOpenMMCuda.so, libOpenMMOpenCL.so,
The combination rule of the used force field matches the one used by OpenMM.
Gromacs will use the OpenMM platform: Cuda
Gromacs will run on the GPU #0 (GeForce GTX 260).
Pre-simulation ~15s memtest in progress...
Memory test completed without errors.
Constraining the starting coordinates (step 0)
Constraining the coordinates at t0-dt (step 0)
Initial temperature: 0 K
Started mdrun on node 0 Fri Oct 8 16:54:04 2010
Step Time Lambda
0 0.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.34934e+05 1.22629e+05 -4.12305e+05 2.97883e+02 1.03777e-06
Step Time Lambda
1000 1.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.42963e+05 1.16609e+05 -4.26354e+05 2.83260e+02 1.03777e-06
Step Time Lambda
2000 2.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.49782e+05 1.14408e+05 -4.35374e+05 2.77912e+02 1.03777e-06
Step Time Lambda
3000 3.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.51337e+05 1.12705e+05 -4.38631e+05 2.73777e+02 1.03777e-06
Step Time Lambda
4000 4.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.52340e+05 1.12827e+05 -4.39513e+05 2.74073e+02 1.03777e-06
Step Time Lambda
5000 5.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.52599e+05 1.13543e+05 -4.39056e+05 2.75812e+02 1.03777e-06
Step Time Lambda
6000 6.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.52946e+05 1.14271e+05 -4.38675e+05 2.77580e+02 1.03777e-06
Step Time Lambda
7000 7.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.51992e+05 1.13521e+05 -4.38471e+05 2.75759e+02 1.03777e-06
Step Time Lambda
8000 8.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.52834e+05 1.14111e+05 -4.38723e+05 2.77192e+02 1.03777e-06
Step Time Lambda
9000 9.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.52806e+05 1.13783e+05 -4.39022e+05 2.76396e+02 1.03777e-06
Step Time Lambda
10000 10.00000 0.00000
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.53230e+05 1.12594e+05 -4.40636e+05 2.73506e+02 1.03777e-06
Writing checkpoint, step 10000 at Fri Oct 8 17:06:02 2010
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Statistics over 11 steps using 11 frames
Energies (kJ/mol)
Potential Kinetic En. Total Energy Temperature Constr. rmsd
-5.49797e+05 1.14636e+05 -4.35160e+05 2.78468e+02 0.00000e+00
Box-X Box-Y Box-Z
1.73572e+12 1.19301e-40 2.31720e+11
Total Virial (kJ/mol)
0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00
Pressure (bar)
0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00
Total Dipole (D)
0.00000e+00 0.00000e+00 0.00000e+00
Epot (kJ/mol) Coul-SR LJ-SR Coul-14 LJ-14
glu242side-glu242side 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
glu242side-rest 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
rest-rest 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
Post-simulation ~15s memtest in progress...
Memory test completed without errors.
M E G A - F L O P S A C C O U N T I N G
RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
NF=No Forces
Computing: M-Number M-Flops % Flops
-----------------------------------------------------------------------------
Lincs 0.011934 0.716 8.0
Lincs-Mat 0.106200 0.425 4.7
Constraint-V 0.046449 0.372 4.2
Settle 0.023010 7.432 83.1
-----------------------------------------------------------------------------
Total 8.945 100.0
-----------------------------------------------------------------------------
R E A L C Y C L E A N D T I M E A C C O U N T I N G
Computing: Nodes Number G-Cycles Seconds %
-----------------------------------------------------------------------
Write traj. 1 11 3.033 1.1 0.2
Rest 1 1978.521 716.9 99.8
-----------------------------------------------------------------------
Total 1 1981.554 718.0 100.0
-----------------------------------------------------------------------
OpenMM run - timing based on wallclock.
NODE (s) Real (s) (%)
Time: 717.970 717.970 100.0
11:57
(Mnbf/s) (MFlops) (ns/day) (hour/ns)
Performance: 0.000 0.012 1.204 19.942
Finished mdrun on node 0 Fri Oct 8 17:06:02 2010
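The drift in the mdrun-gpu log above can be summarized with a few lines of Python. This is only a sketch: the numbers are copied from the energy blocks of the log, REF_T is the ref_t value from grpopts, and the variable names are made up for illustration.

```python
REF_T = 298.15  # ref_t from the grpopts section, in K

# Temperature (K) at steps 0, 1000, ..., 10000 as printed by mdrun-gpu
gpu_temps = [297.883, 283.260, 277.912, 273.777, 274.073, 275.812,
             277.580, 275.759, 277.192, 276.396, 273.506]

# Total energy (kJ/mol) at the first and last logged step
e_total_first = -4.12305e+05
e_total_last = -4.40636e+05

mean_temp = sum(gpu_temps) / len(gpu_temps)
final_offset = gpu_temps[-1] - REF_T
energy_change = e_total_last - e_total_first

print(f"mean T over {len(gpu_temps)} frames: {mean_temp:.3f} K")
print(f"final T minus ref_t: {final_offset:.3f} K")
print(f"total energy change over the run: {energy_change:.0f} kJ/mol")
```

The mean over the 11 logged frames matches the 2.78468e+02 K in the AVERAGES block of the GPU log, which only uses those 11 frames, whereas the CPU run averages over 1001 frames.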
////////////////////////////////////////////////////////////////////////////////////////////////////
> Igor Leontyev wrote:
>
> Finally, I compiled and ran simulations with the GPU version of gromacs-4.5.1.
> There were several issues:
>
> 1) Precompiled OpenMM2.0 libraries and headers must be downloaded (which
> requires registration on their web page) and installed, otherwise cmake
> doesn't find some source files.
>
> 2) cmake should be called outside the original source directory with the
> path of the directory as an argument.
>
> 3) To run the resulting mdrun-gpu binary, the CUDA dev driver must be
> installed; otherwise the program does not find 'CUDA'. This step turned out
> to be the most problematic for me. According to the OpenMM manual, the driver
> must be installed with the X Window service turned off, which can be done with
> the command "init 3". In Ubuntu this command has no effect; instead, the
> graphical interface is switched off/on with
>
> "sudo service gdm stop/start"
>
> It turned out that in Ubuntu-10.04 the CUDA driver installation script does
> not work properly even with gdm turned off. This issue and its solution are
> described at http://ubuntuforums.org/showthread.php?t=1467074
>
> Thank you for comments,
>
> Igor
>>
>>
>> Szilárd Páll wrote:
>> Dear Igor,
>>
>> Your output looks _very_ weird; it seems as if CMake internal
>> variable(s) were not initialized, and I have no clue how that could have
>> happened - the build generator works just fine for me. The only thing
>> I can think of is that maybe your CMakeCache is corrupted.
>>
>> Could you please rerun cmake in a _clean_ build directory? Also, are
>> you able to run cmake for a CPU build (no -D options)?
>>
>> --
>> Szilárd
>>
>>> Szilárd wrote:
>>>>
>>>> The beta versions are all outdated, could you please use the latest
>>>> source distribution (4.5.1) instead (or git from the
>>>> release-4-5-patches branch)?
>>>
>>> The result is the same for both the 4.5.1 distribution and git from the
>>> release-4-5-patches branch. See the output below.
>>> =========================================
>>>
>>> PATH=/usr/local/opt/bin/mpi/openmpi-1.4.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
>>> LD_LIBRARY_PATH=/usr/local/opt/bin/mpi/openmpi-1.4.2/lib:/home/leontyev/programs/bin/cuda/lib64:
>>> CPPFLAGS=-I//usr/local/opt/bin/gromacs/fftw-3.2.2/single_sse/include -I//usr/local/opt/bin/mpi/openmpi-1.4.2/include
>>> LDFLAGS=-L//usr/local/opt/bin/gromacs/fftw-3.2.2/single_sse/lib -L//usr/local/opt/bin/mpi/openmpi-1.4.2/lib
>>> OPENMM_ROOT_DIR=/home/leontyev/programs/bin/gromacs/gromacs-4.5.1-git/openmm
>>>
>>> cmake src -DGMX_OPENMM=ON -DGMX_THREADS=OFF -DCMAKE_INSTALL_PREFIX=/home/leontyev/programs/bin/gromacs/gromacs-4.5.1-git
>>> CMake Error at gmxlib/CMakeLists.txt:124 (set_target_properties):
>>> set_target_properties called with incorrect number of arguments.
>>>
>>>
>>> CMake Error at gmxlib/CMakeLists.txt:126 (install):
>>> install TARGETS given no ARCHIVE DESTINATION for static library target
>>> "gmx".
>>>
>>>
>>> CMake Error at mdlib/CMakeLists.txt:11 (set_target_properties):
>>> set_target_properties called with incorrect number of arguments.
>>>
>>>
>>> CMake Error at mdlib/CMakeLists.txt:13 (install):
>>> install TARGETS given no ARCHIVE DESTINATION for static library target
>>> "md".
>>>
>>>
>>> CMake Error at kernel/CMakeLists.txt:43 (set_target_properties):
>>> set_target_properties called with incorrect number of arguments.
>>>
>>>
>>> CMake Error at kernel/CMakeLists.txt:44 (set_target_properties):
>>> set_target_properties called with incorrect number of arguments.
>>>
>>>
>>> CMake Error at kernel/gmx_gpu_utils/CMakeLists.txt:18 (CUDA_INCLUDE_DIRECTORIES):
>>> Unknown CMake command "CUDA_INCLUDE_DIRECTORIES".
>>>
>>>
>>> CMake Warning (dev) in CMakeLists.txt:
>>> No cmake_minimum_required command is present. A line of code such as
>>>
>>> cmake_minimum_required(VERSION 2.8)
>>>
>>> should be added at the top of the file. The version specified may be lower
>>> if you wish to support older CMake versions for this project. For more
>>> information run "cmake --help-policy CMP0000".
>>> This warning is for project developers. Use -Wno-dev to suppress it.
>>>
>>> -- Configuring incomplete, errors occurred!