[gmx-users] Simulations on GPU

Francesco Oteri francesco.oteri at gmail.com
Tue Jan 4 15:16:23 CET 2011


The force field may be the problem.
mdrun-gpu works with the AMBER force fields, but it cannot be used 
with the GROMOS force fields.
When a GROMOS force field is used, the output is full of NaN values.
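
As a quick sanity check on the quoted logs, the reported temperature can be recomputed from the kinetic energy and the number of degrees of freedom (nrdf) with T = 2*Ekin/(nrdf*kB). The sketch below is only an illustration, not part of Gromacs: the function names, the 5 K tolerance and the hand-copied frame list are my own assumptions. The Ekin values and nrdf = 99021 are copied from the mdrun-gpu log quoted below.

```python
import math

# Boltzmann constant in kJ/(mol K), the energy units used in the log.
KB = 0.0083144621


def temperature_from_ekin(ekin_kj_mol, nrdf):
    """Instantaneous temperature from T = 2*Ekin / (nrdf * kB)."""
    return 2.0 * ekin_kj_mol / (nrdf * KB)


def check_frames(frames, nrdf, ref_t, tolerance=5.0):
    """Flag frames whose energy is NaN or whose temperature is far from ref_t.

    frames    -- list of (step, kinetic energy in kJ/mol) pairs
    tolerance -- allowed |T - ref_t| in K (hypothetical threshold)
    """
    problems = []
    for step, ekin in frames:
        if math.isnan(ekin):
            problems.append((step, "NaN energy"))
            continue
        t = temperature_from_ekin(ekin, nrdf)
        if abs(t - ref_t) > tolerance:
            problems.append((step, f"T = {t:.1f} K, ref_t = {ref_t} K"))
    return problems


# (step, Kinetic En.) pairs copied by hand from the mdrun-gpu log.
gpu_frames = [
    (0, 1.22629e+05),
    (1000, 1.16609e+05),
    (5000, 1.13543e+05),
    (10000, 1.12594e+05),
]

for step, message in check_frames(gpu_frames, nrdf=99021, ref_t=298.15):
    print(step, message)
```

With these inputs, every frame after step 0 is flagged. This only confirms that the thermostat is not holding ref_t in the GPU run. It does not show why.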

On 12/10/2010 03:00, Igor Leontyev wrote:
> I am now able to run simulations on the GPU, but the output is weird. For 
> example, the temperature drops to 270 K although ref_t=298 
> (Tcoupl=andersen). Moreover, after several hours of simulation, 
> mdrun-gpu starts to output "NAN" energies and hangs. The pre-run and 
> post-run GPU memory tests always pass. The graphics card is the one 
> supplied with HP desktops (possibly made by MSI), an NVIDIA GTX260 with 
> 1.8 GB of memory. The output of the mdrun and mdrun-gpu versions of 
> Gromacs is given below. Any ideas? Thanks.
>
> Igor
>
> //////////////////////////////////////////////////////////////////////////////////////////////////// 
>
> Log file opened on Fri Oct  8 14:46:51 2010
> Host: powerpc  pid: 32083  nodeid: 0  nnodes:  4
> The Gromacs distribution was built Thu Sep 30 14:42:48 PDT 2010 by
> leontyev at powerpc (Linux 2.6.32-22-generic x86_64)
>
>
>                         :-)  G  R  O  M  A  C  S  (-:
>
>               Gromacs Runs One Microsecond At Cannonball Speeds
>
>                            :-)  VERSION 4.5.1  (-:
>
>        Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
>      Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
>        Gerrit Groenhof, Peter Kasson, Per Larsson, Peiter Meulenhoff,
>          Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schultz,
>                Michael Shirts, Alfons Sijbers, Peter Tieleman,
>
>               Berk Hess, David van der Spoel, and Erik Lindahl.
>
>       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>            Copyright (c) 2001-2010, The GROMACS development team at
>        Uppsala University & The Royal Institute of Technology, Sweden.
>            check out http://www.gromacs.org for more information.
>
>         This program is free software; you can redistribute it and/or
>          modify it under the terms of the GNU General Public License
>         as published by the Free Software Foundation; either version 2
>             of the License, or (at your option) any later version.
>
>      :-)  /usr/local/opt/bin/gromacs/gromacs-4.5.1/bin/mdrun_mpich2  (-:
>
>
>
> Input Parameters:
>   integrator           = md
>   nsteps               = 10000
>   init_step            = 0
>   ns_type              = Grid
>   nstlist              = 10
>   ndelta               = 2
>   nstcomm              = 1003
>   comm_mode            = Linear
>   nstlog               = 1000
>   nstxout              = 5000
>   nstvout              = 10000000
>   nstfout              = 0
>   nstcalcenergy        = 10
>   nstenergy            = 1000
>   nstxtcout            = 0
>   init_t               = 0
>   delta_t              = 0.001
>   xtcprec              = 1000
>   nkx                  = 54
>   nky                  = 60
>   nkz                  = 90
>   pme_order            = 6
>   ewald_rtol           = 1e-05
>   ewald_geometry       = 0
>   epsilon_surface      = 0
>   optimize_fft         = TRUE
>   ePBC                 = xyz
>   bPeriodicMols        = FALSE
>   bContinuation        = FALSE
>   bShakeSOR            = FALSE
>   etc                  = Andersen
>   nsttcouple           = 10
>   epc                  = Berendsen
>   epctype              = Isotropic
>   nstpcouple           = 10
>   tau_p                = 0.5
>   ref_p (3x3):
>      ref_p[    0]={ 1.01325e+00,  0.00000e+00,  0.00000e+00}
>      ref_p[    1]={ 0.00000e+00,  1.01325e+00,  0.00000e+00}
>      ref_p[    2]={ 0.00000e+00,  0.00000e+00,  1.01325e+00}
>   compress (3x3):
>      compress[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
>      compress[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
>      compress[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
>   refcoord_scaling     = No
>   posres_com (3):
>      posres_com[0]= 0.00000e+00
>      posres_com[1]= 0.00000e+00
>      posres_com[2]= 0.00000e+00
>   posres_comB (3):
>      posres_comB[0]= 0.00000e+00
>      posres_comB[1]= 0.00000e+00
>      posres_comB[2]= 0.00000e+00
>   andersen_seed        = 815131
>   rlist                = 1.2
>   rlistlong            = 1.2
>   rtpi                 = 0.05
>   coulombtype          = PME
>   rcoulomb_switch      = 0
>   rcoulomb             = 1.2
>   vdwtype              = Cut-off
>   rvdw_switch          = 0
>   rvdw                 = 1.2
>   epsilon_r            = 1
>   epsilon_rf           = 1
>   tabext               = 1
>   implicit_solvent     = No
>   gb_algorithm         = Still
>   gb_epsilon_solvent   = 80
>   nstgbradii           = 1
>   rgbradii             = 1
>   gb_saltconc          = 0
>   gb_obc_alpha         = 1
>   gb_obc_beta          = 0.8
>   gb_obc_gamma         = 4.85
>   gb_dielectric_offset = 0.009
>   sa_algorithm         = No
>   sa_surface_tension   = 2.092
>   DispCorr             = EnerPres
>   free_energy          = no
>   init_lambda          = 0
>   delta_lambda         = 0
>   n_foreign_lambda     = 0
>   sc_alpha             = 0
>   sc_power             = 0
>   sc_sigma             = 0.3
>   sc_sigma_min         = 0.3
>   nstdhdl              = 10
>   separate_dhdl_file   = yes
>   dhdl_derivatives     = yes
>   dh_hist_size         = 0
>   dh_hist_spacing      = 0.1
>   nwall                = 0
>   wall_type            = 9-3
>   wall_atomtype[0]     = -1
>   wall_atomtype[1]     = -1
>   wall_density[0]      = 0
>   wall_density[1]      = 0
>   wall_ewald_zfac      = 3
>   pull                 = no
>   disre                = No
>   disre_weighting      = Conservative
>   disre_mixed          = FALSE
>   dr_fc                = 1000
>   dr_tau               = 0
>   nstdisreout          = 100
>   orires_fc            = 0
>   orires_tau           = 0
>   nstorireout          = 100
>   dihre-fc             = 1000
>   em_stepsize          = 0.01
>   em_tol               = 10
>   niter                = 20
>   fc_stepsize          = 0
>   nstcgsteep           = 1000
>   nbfgscorr            = 10
>   ConstAlg             = Lincs
>   shake_tol            = 0.0001
>   lincs_order          = 8
>   lincs_warnangle      = 30
>   lincs_iter           = 4
>   bd_fric              = 0
>   ld_seed              = 1993
>   cos_accel            = 0
>   deform (3x3):
>      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>   userint1             = 0
>   userint2             = 0
>   userint3             = 0
>   userint4             = 0
>   userreal1            = 0
>   userreal2            = 0
>   userreal3            = 0
>   userreal4            = 0
> grpopts:
>   nrdf:       99021
>   ref_t:      298.15
>   tau_t:         0.3
> anneal:          No
> ann_npoints:           0
>   acc:            0           0           0
>   nfreeze:           Y           Y           Y           N           N N
>   energygrp_flags[  0]: 0 0
>   energygrp_flags[  1]: 0 0
>   efield-x:
>      n = 0
>   efield-xt:
>      n = 0
>   efield-y:
>      n = 0
>   efield-yt:
>      n = 0
>   efield-z:
>      n = 0
>   efield-zt:
>      n = 0
>   bQMMM                = FALSE
>   QMconstraints        = 0
>   QMMMscheme           = 0
>   scalefactor          = 1
> qm_opts:
>   ngQM                 = 0
>
> Initializing Domain Decomposition on 4 nodes
> Dynamic load balancing: auto
> Will sort the charge groups at every domain (re)decomposition
> Initial maximum inter charge-group distances:
>    two-body bonded interactions: 0.585 nm, LJ-14, atoms 10901 11433
>  multi-body bonded interactions: 0.482 nm, Ryckaert-Bell., atoms 11431 11935
> Minimum cell size due to bonded interactions: 0.530 nm
> Maximum distance for 9 constraints, at 120 deg. angles, all-trans: 0.218 nm
> Estimated maximum distance required for P-LINCS: 0.218 nm
> Using 0 separate PME nodes
> Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
> Optimizing the DD grid for 4 cells with a minimum initial size of 0.663 nm
> The maximum allowed number of cells is: X 9 Y 10 Z 16
> Domain decomposition grid 1 x 4 x 1, separate PME nodes 0
> PME domain decomposition: 1 x 4 x 1
> Domain decomposition nodeid 0, coordinates 0 0 0
>
> Table routines are used for coulomb: TRUE
> Table routines are used for vdw:     FALSE
> Will do PME sum in reciprocal space.
>
> Will do ordinary reciprocal space Ewald sum.
> Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
> Cut-off's:   NS: 1.2   Coulomb: 1.2   LJ: 1.2
> Long Range LJ corr.: <C6> 4.0351e-04
> System total charge: -0.000
> Generated table with 1100 data points for Ewald.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for LJ6.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for LJ12.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 COUL.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 LJ6.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 LJ12.
> Tabscale = 500 points/nm
>
> Enabling SPC-like water optimization for 11505 molecules.
>
> Configuring nonbonded kernels...
> Configuring standard C nonbonded kernels...
> Testing x86_64 SSE2 support... present.
>
>
> Removing pbc first time
>
> Initializing Parallel LINear Constraint Solver
>
> Linking all bonded interactions to atoms
> There are 65716 inter charge-group exclusions,
> will use an extra communication step for exclusion forces for PME
>
> The initial number of communication pulses is: Y 1
> The initial domain decomposition cell size is: Y 1.77 nm
>
> The maximum allowed distance for charge groups involved in interactions is:
>                 non-bonded interactions           1.200 nm
> (the following are initial values, they could change due to box deformation)
>            two-body bonded interactions  (-rdd)   1.200 nm
>          multi-body bonded interactions  (-rdd)   1.200 nm
>  atoms separated by up to 9 constraints  (-rcon)  1.773 nm
>
> When dynamic load balancing gets turned on, these settings will change to:
> The maximum number of communication pulses is: Y 1
> The minimum size for domain decomposition cells is 1.200 nm
> The requested allowed shrink of DD cells (option -dds) is: 0.80
> The allowed shrink of domain decomposition cells is: Y 0.68
> The maximum allowed distance for charge groups involved in interactions is:
>                 non-bonded interactions           1.200 nm
>            two-body bonded interactions  (-rdd)   1.200 nm
>          multi-body bonded interactions  (-rdd)   1.200 nm
>  atoms separated by up to 9 constraints  (-rcon)  1.200 nm
>
>
> Making 1D domain decomposition grid 1 x 4 x 1, home cell index 0 0 0
>
> Center of mass motion removal mode is Linear
> We have the following groups for center of mass motion removal:
>  0:  rest
> There are: 46503 Atoms
> Charge group distribution at step 0: 4533 7043 7334 4581
> Grid: 10 x 6 x 17 cells
>
> Constraining the starting coordinates (step 0)
>
> Constraining the coordinates at t0-dt (step 0)
> RMS relative constraint deviation after constraining: 7.96e-07
> Initial temperature: 297.745 K
>
> Started mdrun on node 0 Fri Oct  8 14:46:51 2010
>
>           Step           Time         Lambda
>              0        0.00000        0.00000
>
>   Energies (kJ/mol)
>           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
>    9.26629e+03    2.53358e+04    1.36779e+03    2.97600e+04    1.20809e+04
>     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>    1.40505e+05    3.83498e+04   -2.30989e+03   -5.95333e+05   -1.96357e+05
>      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>   -5.37334e+05    1.22595e+05   -4.14739e+05    2.97810e+02   -1.67546e+02
> Pressure (bar)   Constr. rmsd
>    2.67468e+00    1.03652e-06
>
> DD  step 9 load imb.: force 19.9%
>
> At step 10 the performance loss due to force load imbalance is 9.3 %
>
> NOTE: Turning on dynamic load balancing
>
> DD  step 999  vol min/aver 0.777  load imb.: force  0.1%
>
>           Step           Time         Lambda
>           1000        1.00000        0.00000
>
>   Energies (kJ/mol)
>           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
>    9.29054e+03    2.49530e+04    1.43296e+03    2.96188e+04    1.19777e+04
>     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>    1.40496e+05    3.99112e+04   -2.30308e+03   -5.96482e+05   -1.96429e+05
>      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>   -5.37533e+05    1.22974e+05   -4.14560e+05    2.98729e+02   -1.66560e+02
> Pressure (bar)   Constr. rmsd
>   -1.40877e+02    1.04647e-06
>
> DD  step 1999  vol min/aver 0.773  load imb.: force  0.1%
>
> ................................................................................ 
>
>
>           Step           Time         Lambda
>          10000       10.00000        0.00000
>
> Writing checkpoint, step 10000 at Fri Oct  8 14:58:26 2010
>
>
>   Energies (kJ/mol)
>           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
>    9.00658e+03    2.52059e+04    1.34920e+03    2.95995e+04    1.19606e+04
>     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>    1.40474e+05    4.00471e+04   -2.30290e+03   -5.96601e+05   -1.96374e+05
>      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>   -5.37636e+05    1.22577e+05   -4.15059e+05    2.97765e+02   -1.66533e+02
> Pressure (bar)   Constr. rmsd
>   -5.69272e+01    1.04191e-06
>
> <======  ###############  ==>
> <====  A V E R A G E S  ====>
> <==  ###############  ======>
>
> Statistics over 10001 steps using 1001 frames
>
>   Energies (kJ/mol)
>           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
>    9.11274e+03    2.49545e+04    1.36688e+03    2.96269e+04    1.20386e+04
>     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>    1.40680e+05    3.95513e+04   -2.30457e+03   -5.95701e+05   -1.96403e+05
>      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>   -5.37077e+05    1.22248e+05   -4.14829e+05    2.96967e+02   -1.66776e+02
> Pressure (bar)   Constr. rmsd
>    4.32167e+00    0.00000e+00
>
>          Box-X          Box-Y          Box-Z
>    6.01417e+00    7.09874e+00    1.07493e+01
>
>   Total Virial (kJ/mol)
>    4.10242e+04   -9.05005e+00   -2.30129e+02
>   -1.20791e+01    4.05914e+04    1.71615e+02
>   -2.13770e+02    1.99254e+02    4.04540e+04
>
>   Pressure (bar)
>   -1.22617e+01   -8.98547e-01    1.93020e+01
>   -6.78561e-01    2.15880e+01   -8.68094e+00
>    1.81194e+01   -1.06823e+01    3.63870e+00
>
>   Total Dipole (D)
>    4.73145e+02   -1.30311e+03   -2.15240e+02
>
>  Epot (kJ/mol)        Coul-SR          LJ-SR        Coul-14          LJ-14
> glu242side-glu242side    2.99268e+00    0.00000e+00   -1.85865e+02    1.35027e+00
> glu242side-rest   -5.15085e+01   -2.83484e+01    2.08195e+01    4.24614e+00
>      rest-rest   -5.95653e+05    3.95797e+04    1.40846e+05    1.20330e+04
>
>
> M E G A - F L O P S   A C C O U N T I N G
>
>   RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>   T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>   NF=No Forces
>
> Computing:                               M-Number         M-Flops  % Flops
> ----------------------------------------------------------------------------- 
>
> Coul(T)                              10781.556318      452825.365     4.6
> Coul(T) [W3]                            70.655819        8831.977     0.1
> Coul(T) + LJ                         34247.547832     1883615.131    18.9
> Coul(T) + LJ [W3]                     4684.616330      646477.054     6.5
> Coul(T) + LJ [W3-W3]                 12244.355656     4677343.861    47.0
> Outer nonbonded loop                  2334.588434       23345.884     0.2
> 1,4 nonbonded interactions             314.111408       28270.027     0.3
> Calc Weights                          1395.229509       50228.262     0.5
> Spread Q Bspline                    100456.524648      200913.049     2.0
> Gather F Bspline                    100456.524648      602739.148     6.1
> 3D-FFT                              105882.547196      847060.378     8.5
> Solve PME                             1490.549040       95395.139     1.0
> NS-Pairs                             12131.757207      254766.901     2.6
> Reset In Box                            23.514491          70.543     0.0
> CG-CoM                                  46.596006         139.788     0.0
> Bonds                                   61.886188        3651.285     0.0
> Angles                                 219.991997       36958.655     0.4
> Propers                                 23.872387        5466.777     0.1
> RB-Dihedrals                           253.765374       62680.047     0.6
> Virial                                  46.729683         841.134     0.0
> Stop-CM                                  0.465030           4.650     0.0
> P-Coupling                             465.076503        2790.459     0.0
> Calc-Ekin                              465.123006       12558.321     0.1
> Lincs                                   62.989618        3779.377     0.0
> Lincs-Mat                              569.895960        2279.584     0.0
> Constraint-V                           471.185790        3769.486     0.0
> Constraint-Vir                          40.852892         980.469     0.0
> Settle                                 115.084515       37172.298     0.4
> ----------------------------------------------------------------------------- 
>
> Total                                                 9944955.052   100.0
> ----------------------------------------------------------------------------- 
>
>
>
>    D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S
>
> av. #atoms communicated per step for force:  2 x 31556.3
> av. #atoms communicated per step for LINCS:  5 x 512.1
>
> Average load imbalance: 0.5 %
> Part of the total run time spent waiting due to load imbalance: 0.3 %
> Steps where the load balancing was limited by -rdd, -rcon and/or -dds: Y 0 %
>
>
>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> Computing:         Nodes     Number     G-Cycles    Seconds     %
> -----------------------------------------------------------------------
> Domain decomp.         4       1001       56.954       21.4     0.8
> DD comm. load          4       1000        0.206        0.1     0.0
> DD comm. bounds        4       1000        2.836        1.1     0.0
> Comm. coord.           4      10001       30.480       11.5     0.4
> Neighbor search        4       1001      579.978      218.1     7.8
> Force                  4      10001     4548.315     1710.0    61.5
> Wait + Comm. F         4      10001       17.520        6.6     0.2
> PME mesh               4      10001     1897.783      713.5    25.7
> Write traj.            4          3        0.668        0.3     0.0
> Update                 4      10001       45.142       17.0     0.6
> Constraints            4      10001      181.826       68.4     2.5
> Comm. energies         4       1011        3.026        1.1     0.0
> Rest                   4                  31.895       12.0     0.4
> -----------------------------------------------------------------------
> Total                  4                7396.630     2780.9   100.0
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> PME redist. X/F        4      20002      208.454       78.4     2.8
> PME spread/gather      4      20002     1440.827      541.7    19.5
> PME 3D-FFT             4      20002      203.508       76.5     2.8
> PME solve              4      10001       44.697       16.8     0.6
> -----------------------------------------------------------------------
>
> Parallel run - timing based on wallclock.
>
>               NODE (s)   Real (s)      (%)
>       Time:    695.218    695.218    100.0
>                       11:35
>               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:    243.800     14.305      1.243     19.310
> Finished mdrun on node 0 Fri Oct  8 14:58:27 2010
>
>
> //////////////////////////////////////////////////////////////////////////////////////////////////// 
>
>                         :-)  G  R  O  M  A  C  S  (-:
>
>                   Groningen Machine for Chemical Simulation
>
>                   :-)  VERSION 4.5.1-dev-20101006-d3b58  (-:
>
>        Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
>      Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
>        Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff,
>          Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schultz,
>                Michael Shirts, Alfons Sijbers, Peter Tieleman,
>
>               Berk Hess, David van der Spoel, and Erik Lindahl.
>
>       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>            Copyright (c) 2001-2010, The GROMACS development team at
>        Uppsala University & The Royal Institute of Technology, Sweden.
>            check out http://www.gromacs.org for more information.
>
>         This program is free software; you can redistribute it and/or
>          modify it under the terms of the GNU General Public License
>         as published by the Free Software Foundation; either version 2
>             of the License, or (at your option) any later version.
>
> :-)  /home/leontyev/programs/bin/gromacs/gromacs-4.5.1-gpu/bin/mdrun-gpu (-:
>
> Input Parameters:
>   integrator           = md
>   nsteps               = 10000
>   init_step            = 0
>   ns_type              = Grid
>   nstlist              = 10
>   ndelta               = 2
>   nstcomm              = 1003
>   comm_mode            = Linear
>   nstlog               = 1000
>   nstxout              = 5000
>   nstvout              = 10000000
>   nstfout              = 0
>   nstcalcenergy        = 10
>   nstenergy            = 1000
>   nstxtcout            = 0
>   init_t               = 0
>   delta_t              = 0.001
>   xtcprec              = 1000
>   nkx                  = 54
>   nky                  = 60
>   nkz                  = 90
>   pme_order            = 6
>   ewald_rtol           = 1e-05
>   ewald_geometry       = 0
>   epsilon_surface      = 0
>   optimize_fft         = TRUE
>   ePBC                 = xyz
>   bPeriodicMols        = FALSE
>   bContinuation        = FALSE
>   bShakeSOR            = FALSE
>   etc                  = Andersen
>   nsttcouple           = 10
>   epc                  = Berendsen
>   epctype              = Isotropic
>   nstpcouple           = 10
>   tau_p                = 0.5
>   ref_p (3x3):
>      ref_p[    0]={ 1.01325e+00,  0.00000e+00,  0.00000e+00}
>      ref_p[    1]={ 0.00000e+00,  1.01325e+00,  0.00000e+00}
>      ref_p[    2]={ 0.00000e+00,  0.00000e+00,  1.01325e+00}
>   compress (3x3):
>      compress[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
>      compress[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
>      compress[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
>   refcoord_scaling     = No
>   posres_com (3):
>      posres_com[0]= 0.00000e+00
>      posres_com[1]= 0.00000e+00
>      posres_com[2]= 0.00000e+00
>   posres_comB (3):
>      posres_comB[0]= 0.00000e+00
>      posres_comB[1]= 0.00000e+00
>      posres_comB[2]= 0.00000e+00
>   andersen_seed        = 815131
>   rlist                = 1.2
>   rlistlong            = 1.2
>   rtpi                 = 0.05
>   coulombtype          = PME
>   rcoulomb_switch      = 0
>   rcoulomb             = 1.2
>   vdwtype              = Cut-off
>   rvdw_switch          = 0
>   rvdw                 = 1.2
>   epsilon_r            = 1
>   epsilon_rf           = 1
>   tabext               = 1
>   implicit_solvent     = No
>   gb_algorithm         = Still
>   gb_epsilon_solvent   = 80
>   nstgbradii           = 1
>   rgbradii             = 1
>   gb_saltconc          = 0
>   gb_obc_alpha         = 1
>   gb_obc_beta          = 0.8
>   gb_obc_gamma         = 4.85
>   gb_dielectric_offset = 0.009
>   sa_algorithm         = Ace-approximation
>   sa_surface_tension   = 2.092
>   DispCorr             = EnerPres
>   free_energy          = no
>   init_lambda          = 0
>   delta_lambda         = 0
>   n_foreign_lambda     = 0
>   sc_alpha             = 0
>   sc_power             = 0
>   sc_sigma             = 0.3
>   sc_sigma_min         = 0.3
>   nstdhdl              = 10
>   separate_dhdl_file   = yes
>   dhdl_derivatives     = yes
>   dh_hist_size         = 0
>   dh_hist_spacing      = 0.1
>   nwall                = 0
>   wall_type            = 9-3
>   wall_atomtype[0]     = -1
>   wall_atomtype[1]     = -1
>   wall_density[0]      = 0
>   wall_density[1]      = 0
>   wall_ewald_zfac      = 3
>   pull                 = no
>   disre                = No
>   disre_weighting      = Conservative
>   disre_mixed          = FALSE
>   dr_fc                = 1000
>   dr_tau               = 0
>   nstdisreout          = 100
>   orires_fc            = 0
>   orires_tau           = 0
>   nstorireout          = 100
>   dihre-fc             = 1000
>   em_stepsize          = 0.01
>   em_tol               = 10
>   niter                = 20
>   fc_stepsize          = 0
>   nstcgsteep           = 1000
>   nbfgscorr            = 10
>   ConstAlg             = Lincs
>   shake_tol            = 0.0001
>   lincs_order          = 8
>   lincs_warnangle      = 30
>   lincs_iter           = 4
>   bd_fric              = 0
>   ld_seed              = 1993
>   cos_accel            = 0
>   deform (3x3):
>      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>   userint1             = 0
>   userint2             = 0
>   userint3             = 0
>   userint4             = 0
>   userreal1            = 0
>   userreal2            = 0
>   userreal3            = 0
>   userreal4            = 0
> grpopts:
>   nrdf:       99021
>   ref_t:      298.15
>   tau_t:         0.3
> anneal:          No
> ann_npoints:           0
>   acc:            0           0           0
>   nfreeze:           Y           Y           Y           N           N N
>   energygrp_flags[  0]: 0 0
>   energygrp_flags[  1]: 0 0
>   efield-x:
>      n = 0
>   efield-xt:
>      n = 0
>   efield-y:
>      n = 0
>   efield-yt:
>      n = 0
>   efield-z:
>      n = 0
>   efield-zt:
>      n = 0
>   bQMMM                = FALSE
>   QMconstraints        = 0
>   QMMMscheme           = 0
>   scalefactor          = 1
> qm_opts:
>   ngQM                 = 0
> Table routines are used for coulomb: TRUE
> Table routines are used for vdw:     FALSE
> Will do PME sum in reciprocal space.
>
> Will do ordinary reciprocal space Ewald sum.
> Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
> Cut-off's:   NS: 1.2   Coulomb: 1.2   LJ: 1.2
> Long Range LJ corr.: <C6> 4.0351e-04
> System total charge: -0.000
> Generated table with 1100 data points for Ewald.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for LJ6.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for LJ12.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 COUL.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 LJ6.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 LJ12.
> Tabscale = 500 points/nm
>
> Enabling SPC-like water optimization for 11505 molecules.
>
> Configuring nonbonded kernels...
> Configuring standard C nonbonded kernels...
>
>
> Removing pbc first time
>
> Initializing LINear Constraint Solver
>
> Center of mass motion removal mode is Linear
> We have the following groups for center of mass motion removal:
>  0:  rest
> Max number of connections per atom is 91
> Total number of connections is 387700
> Max number of graph edges per atom is 6
> Total number of graph edges is 70330
>
> OpenMM plugins loaded from directory /home/leontyev/programs/bin/gromacs/OpenMM2.0-Linux64/lib/plugins: libOpenMMCuda.so, libOpenMMOpenCL.so,
> The combination rule of the used force field matches the one used by OpenMM.
> Gromacs will use the OpenMM platform: Cuda
> Gromacs will run on the GPU #0 (GeForce GTX 260).
> Pre-simulation ~15s memtest in progress...
> Memory test completed without errors.
>
> Constraining the starting coordinates (step 0)
>
> Constraining the coordinates at t0-dt (step 0)
> Initial temperature: 0 K
>
> Started mdrun on node 0 Fri Oct  8 16:54:04 2010
>
>           Step           Time         Lambda
>              0        0.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.34934e+05    1.22629e+05   -4.12305e+05    2.97883e+02    1.03777e-06
>
>           Step           Time         Lambda
>           1000        1.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.42963e+05    1.16609e+05   -4.26354e+05    2.83260e+02    1.03777e-06
>
>           Step           Time         Lambda
>           2000        2.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.49782e+05    1.14408e+05   -4.35374e+05    2.77912e+02    1.03777e-06
>
>           Step           Time         Lambda
>           3000        3.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.51337e+05    1.12705e+05   -4.38631e+05    2.73777e+02    1.03777e-06
>
>           Step           Time         Lambda
>           4000        4.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.52340e+05    1.12827e+05   -4.39513e+05    2.74073e+02    1.03777e-06
>
>           Step           Time         Lambda
>           5000        5.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.52599e+05    1.13543e+05   -4.39056e+05    2.75812e+02    1.03777e-06
>
>           Step           Time         Lambda
>           6000        6.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.52946e+05    1.14271e+05   -4.38675e+05    2.77580e+02    1.03777e-06
>
>           Step           Time         Lambda
>           7000        7.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.51992e+05    1.13521e+05   -4.38471e+05    2.75759e+02    1.03777e-06
>
>           Step           Time         Lambda
>           8000        8.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.52834e+05    1.14111e+05   -4.38723e+05    2.77192e+02    1.03777e-06
>
>           Step           Time         Lambda
>           9000        9.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.52806e+05    1.13783e+05   -4.39022e+05    2.76396e+02    1.03777e-06
>
>           Step           Time         Lambda
>          10000       10.00000        0.00000
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.53230e+05    1.12594e+05   -4.40636e+05    2.73506e+02    1.03777e-06
>
> Writing checkpoint, step 10000 at Fri Oct  8 17:06:02 2010
>
>
> <======  ###############  ==>
> <====  A V E R A G E S  ====>
> <==  ###############  ======>
>
> Statistics over 11 steps using 11 frames
>
>   Energies (kJ/mol)
>      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
>   -5.49797e+05    1.14636e+05   -4.35160e+05    2.78468e+02    0.00000e+00
>
>          Box-X          Box-Y          Box-Z
>    1.73572e+12    1.19301e-40    2.31720e+11
>
>   Total Virial (kJ/mol)
>    0.00000e+00    0.00000e+00    0.00000e+00
>    0.00000e+00    0.00000e+00    0.00000e+00
>    0.00000e+00    0.00000e+00    0.00000e+00
>
>   Pressure (bar)
>    0.00000e+00    0.00000e+00    0.00000e+00
>    0.00000e+00    0.00000e+00    0.00000e+00
>    0.00000e+00    0.00000e+00    0.00000e+00
>
>   Total Dipole (D)
>    0.00000e+00    0.00000e+00    0.00000e+00
>
>  Epot (kJ/mol)        Coul-SR          LJ-SR        Coul-14          LJ-14
> glu242side-glu242side    0.00000e+00    0.00000e+00    0.00000e+00    0.00000e+00
> glu242side-rest    0.00000e+00    0.00000e+00    0.00000e+00    0.00000e+00
>      rest-rest    0.00000e+00    0.00000e+00    0.00000e+00    0.00000e+00
>
> Post-simulation ~15s memtest in progress...
> Memory test completed without errors.
>
> M E G A - F L O P S   A C C O U N T I N G
>
>   RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>   T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>   NF=No Forces
>
> Computing:                               M-Number         M-Flops  % Flops
> -----------------------------------------------------------------------------
> Lincs                                    0.011934           0.716     8.0
> Lincs-Mat                                0.106200           0.425     4.7
> Constraint-V                             0.046449           0.372     4.2
> Settle                                   0.023010           7.432    83.1
> -----------------------------------------------------------------------------
> Total                                                       8.945   100.0
> -----------------------------------------------------------------------------
>
>
>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> Computing:         Nodes     Number     G-Cycles    Seconds     %
> -----------------------------------------------------------------------
> Write traj.            1         11        3.033        1.1     0.2
> Rest                   1                1978.521      716.9    99.8
> -----------------------------------------------------------------------
> Total                  1                1981.554      718.0   100.0
> -----------------------------------------------------------------------
>
> OpenMM run - timing based on wallclock.
>
>               NODE (s)   Real (s)      (%)
>       Time:    717.970    717.970    100.0
>                       11:57
>               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
> Performance:      0.000      0.012      1.204     19.942
> Finished mdrun on node 0 Fri Oct  8 17:06:02 2010
> //////////////////////////////////////////////////////////////////////////////////////////////////// 
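
A quick way to quantify the temperature problem in the log above is to compare the logged values with ref_t. This is only a minimal sketch: the temperatures are copied from the quoted mdrun-gpu output (steps 3000 to 10000) and ref_t=298 comes from the original report, while the function name check_temperatures and the 5 K tolerance are illustrative assumptions, not anything provided by GROMACS.

```python
import math

# Temperatures (K) copied from the quoted mdrun-gpu log, steps 3000-10000.
temperatures = [273.777, 274.073, 275.812, 277.580,
                275.759, 277.192, 276.396, 273.506]
REF_T = 298.0  # ref_t from the original report


def check_temperatures(temps, ref_t, tolerance=5.0):
    """Return (mean, offset, ok) for a list of logged temperatures.

    tolerance is an arbitrary illustrative threshold in kelvin.
    Any NaN in the input marks the run as not ok.
    """
    if any(math.isnan(t) for t in temps):
        return float("nan"), float("nan"), False
    mean = sum(temps) / len(temps)
    offset = mean - ref_t
    return mean, offset, abs(offset) <= tolerance


mean_t, offset, ok = check_temperatures(temperatures, REF_T)
print(f"mean T = {mean_t:.1f} K, offset from ref_t = {offset:.1f} K, ok = {ok}")
```

For the values above the mean sits more than 20 K below ref_t, so the check fails; the same function also flags a run as soon as a NaN shows up in the list.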
>
>
>> Igor Leontyev wrote:
>>
>> Finally, I compiled and ran simulations with the GPU version of
>> gromacs-4.5.1.
>> There were several issues:
>>
>> 1) Precompiled OpenMM2.0 libraries and headers must be downloaded (which
>> requires registration on their web page) and installed, otherwise cmake
>> doesn't find some source files.
>>
>> 2) cmake should be called outside the original source directory with the
>> path of the directory as an argument.
>>
>> 3) To run the resulting mdrun-gpu binary, the CUDA dev driver must be
>> installed; otherwise the program does not find 'CUDA'. This step turned
>> out to be the most problematic for me. According to the OpenMM manual,
>> the driver must be installed with the X Window service turned off, which
>> can be done with the command "init 3". In Ubuntu this command has no
>> effect; there, the graphical interface is switched off/on with
>>
>> "sudo service gdm stop/start"
>>
>> It turned out that in Ubuntu-10.04 the CUDA driver installation script
>> does not work properly even with gdm turned off. This issue and its
>> solution are described at http://ubuntuforums.org/showthread.php?t=1467074
>>
>> Thank you for comments,
>>
>> Igor
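
Points 1) and 2) above can be sketched as a small helper that assembles the cmake call for an out-of-source build. The options -DGMX_OPENMM=ON and -DGMX_THREADS=OFF and the OPENMM_ROOT_DIR variable are taken from the commands quoted in this thread; the function name, the dry_run switch and every path are hypothetical placeholders, so treat this as a sketch rather than a tested build recipe.

```python
import os
import subprocess


def configure_gpu_build(source_dir, build_dir, openmm_root,
                        install_prefix, dry_run=True):
    """Assemble (and optionally run) cmake for an out-of-source GPU build.

    All directory arguments are placeholders chosen by the caller.
    """
    source_dir = os.path.abspath(source_dir)
    build_dir = os.path.abspath(build_dir)
    if source_dir == build_dir:
        # cmake is meant to run outside the original source directory
        raise ValueError("build directory must differ from the source directory")

    cmd = [
        "cmake", source_dir,
        "-DGMX_OPENMM=ON",
        "-DGMX_THREADS=OFF",
        f"-DCMAKE_INSTALL_PREFIX={install_prefix}",
    ]
    # The OpenMM libraries and headers are located through this variable.
    env = dict(os.environ, OPENMM_ROOT_DIR=openmm_root)

    if not dry_run:
        os.makedirs(build_dir, exist_ok=True)
        subprocess.run(cmd, cwd=build_dir, env=env, check=True)
    return cmd, env


cmd, env = configure_gpu_build("gromacs-4.5.1", "build-gpu",
                               "/path/to/openmm", "/path/to/install")
print(" ".join(cmd))
```

With dry_run=True nothing is executed and only the command line is printed, which makes it easy to check the arguments before starting from a clean, empty build directory.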
>>>
>>>
>>> Szilárd Páll wrote:
>>> Dear Igor,
>>>
>>> Your output looks _very_ weird; it seems as if CMake internal
>>> variable(s) were not initialized, and I have no clue how that could
>>> have happened - the build generator works just fine for me. The only
>>> thing I can think of is that maybe your CMakeCache is corrupted.
>>>
>>> Could you please rerun cmake in a _clean_ build directory? Also, are
>>> you able to run cmake for a CPU build (no -D options)?
>>>
>>> -- 
>>> Szilárd
>>>
>>>> Szilárd wrote:
>>>>>
>>>>> The beta versions are all outdated, could you please use the latest
>>>>> source distribution (4.5.1) instead (or git from the
>>>>> release-4-5-patches branch)?
>>>>
>>>> The result is the same for both the 4.5.1 distribution and git from
>>>> the release-4-5-patches branch. See the output below.
>>>> =========================================
>>>>
>>>> PATH=/usr/local/opt/bin/mpi/openmpi-1.4.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games 
>>>>
>>>> LD_LIBRARY_PATH=/usr/local/opt/bin/mpi/openmpi-1.4.2/lib:/home/leontyev/programs/bin/cuda/lib64: 
>>>>
>>>> CPPFLAGS=-I//usr/local/opt/bin/gromacs/fftw-3.2.2/single_sse/include
>>>> -I//usr/local/opt/bin/mpi/openmpi-1.4.2/include
>>>> LDFLAGS=-L//usr/local/opt/bin/gromacs/fftw-3.2.2/single_sse/lib
>>>> -L//usr/local/opt/bin/mpi/openmpi-1.4.2/lib
>>>> OPENMM_ROOT_DIR=/home/leontyev/programs/bin/gromacs/gromacs-4.5.1-git/openmm 
>>>>
>>>>
>>>> cmake src -DGMX_OPENMM=ON -DGMX_THREADS=OFF
>>>> -DCMAKE_INSTALL_PREFIX=/home/leontyev/programs/bin/gromacs/gromacs-4.5.1-git 
>>>>
>>>> CMake Error at gmxlib/CMakeLists.txt:124 (set_target_properties):
>>>> set_target_properties called with incorrect number of arguments.
>>>>
>>>>
>>>> CMake Error at gmxlib/CMakeLists.txt:126 (install):
>>>> install TARGETS given no ARCHIVE DESTINATION for static library target
>>>> "gmx".
>>>>
>>>>
>>>> CMake Error at mdlib/CMakeLists.txt:11 (set_target_properties):
>>>> set_target_properties called with incorrect number of arguments.
>>>>
>>>>
>>>> CMake Error at mdlib/CMakeLists.txt:13 (install):
>>>> install TARGETS given no ARCHIVE DESTINATION for static library target
>>>> "md".
>>>>
>>>>
>>>> CMake Error at kernel/CMakeLists.txt:43 (set_target_properties):
>>>> set_target_properties called with incorrect number of arguments.
>>>>
>>>>
>>>> CMake Error at kernel/CMakeLists.txt:44 (set_target_properties):
>>>> set_target_properties called with incorrect number of arguments.
>>>>
>>>>
>>>> CMake Error at kernel/gmx_gpu_utils/CMakeLists.txt:18
>>>> (CUDA_INCLUDE_DIRECTORIES):
>>>> Unknown CMake command "CUDA_INCLUDE_DIRECTORIES".
>>>>
>>>>
>>>> CMake Warning (dev) in CMakeLists.txt:
>>>> No cmake_minimum_required command is present. A line of code such as
>>>>
>>>> cmake_minimum_required(VERSION 2.8)
>>>>
>>>> should be added at the top of the file. The version specified may be
>>>> lower
>>>> if you wish to support older CMake versions for this project. For more
>>>> information run "cmake --help-policy CMP0000".
>>>> This warning is for project developers. Use -Wno-dev to suppress it.
>>>>
>>>> -- Configuring incomplete, errors occurred! 
>
>



