[gmx-users] Simulations on GPU

Igor Leontyev ileontyev at ucdavis.edu
Tue Oct 12 03:00:28 CEST 2010


I am now able to run simulations on the GPU, but the output is weird. For
example, the temperature drops to about 270 K although ref_t=298
(Tcoupl=andersen). Moreover, after several hours of simulation mdrun-gpu
starts to output "NAN" energies and hangs. The pre-run and post-run GPU
memory tests always pass. The graphics card is the one supplied with HP
desktops (might be MSI), an NVIDIA GTX260 with 1.8 GB of memory. The output
of the mdrun and mdrun-gpu versions of Gromacs is given below. Any ideas?
Thanks.

Igor
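
To put a number on the drift, the Temperature column can be pulled out of
each log. In the mdrun-gpu log it goes from 297.883 K at step 0 to 273.506 K
at step 10000. What follows is only a minimal sketch in Python, assuming the
15-character column layout seen in the "Energies (kJ/mol)" blocks of these
logs. The names energy_frames and FIELD_WIDTH are made up for this
illustration and are not part of Gromacs.

```python
FIELD_WIDTH = 15  # assumed column width, inferred from the logs in this post


def energy_frames(log_text):
    """Yield one {column name: value} dict per 'Energies (kJ/mol)' block."""
    lines = log_text.splitlines()
    i = 0
    while i < len(lines):
        if lines[i].strip() != "Energies (kJ/mol)":
            i += 1
            continue
        i += 1
        frame = {}
        # A block is a run of header/value line pairs ended by a blank line.
        while i + 1 < len(lines) and lines[i].strip():
            values = lines[i + 1].split()
            # Re-pad the header in case leading whitespace was lost.
            header = lines[i].rstrip().rjust(FIELD_WIDTH * len(values))
            for k, value in enumerate(values):
                name = header[k * FIELD_WIDTH:(k + 1) * FIELD_WIDTH].strip()
                frame[name] = float(value)
            i += 2
        yield frame


# Two blocks copied from the mdrun-gpu log (steps 0 and 10000).
sample = """\
   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.34934e+05    1.22629e+05   -4.12305e+05    2.97883e+02    1.03777e-06

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.53230e+05    1.12594e+05   -4.40636e+05    2.73506e+02    1.03777e-06
"""

temps = [frame["Temperature"] for frame in energy_frames(sample)]
print(temps)
```

On a complete log the last frame is the run average from the AVERAGES
section, not a time step, so it should be dropped before plotting. Lines
that the mail client wrapped would also need to be rejoined first.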

////////////////////////////////////////////////////////////////////////////////////////////////////
Log file opened on Fri Oct  8 14:46:51 2010
Host: powerpc  pid: 32083  nodeid: 0  nnodes:  4
The Gromacs distribution was built Thu Sep 30 14:42:48 PDT 2010 by
leontyev at powerpc (Linux 2.6.32-22-generic x86_64)


                         :-)  G  R  O  M  A  C  S  (-:

               Gromacs Runs One Microsecond At Cannonball Speeds

                            :-)  VERSION 4.5.1  (-:

        Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
      Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
        Gerrit Groenhof, Peter Kasson, Per Larsson, Peiter Meulenhoff,
          Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schultz,
                Michael Shirts, Alfons Sijbers, Peter Tieleman,

               Berk Hess, David van der Spoel, and Erik Lindahl.

       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
            Copyright (c) 2001-2010, The GROMACS development team at
        Uppsala University & The Royal Institute of Technology, Sweden.
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
          modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
             of the License, or (at your option) any later version.

      :-)  /usr/local/opt/bin/gromacs/gromacs-4.5.1/bin/mdrun_mpich2  (-:



Input Parameters:
   integrator           = md
   nsteps               = 10000
   init_step            = 0
   ns_type              = Grid
   nstlist              = 10
   ndelta               = 2
   nstcomm              = 1003
   comm_mode            = Linear
   nstlog               = 1000
   nstxout              = 5000
   nstvout              = 10000000
   nstfout              = 0
   nstcalcenergy        = 10
   nstenergy            = 1000
   nstxtcout            = 0
   init_t               = 0
   delta_t              = 0.001
   xtcprec              = 1000
   nkx                  = 54
   nky                  = 60
   nkz                  = 90
   pme_order            = 6
   ewald_rtol           = 1e-05
   ewald_geometry       = 0
   epsilon_surface      = 0
   optimize_fft         = TRUE
   ePBC                 = xyz
   bPeriodicMols        = FALSE
   bContinuation        = FALSE
   bShakeSOR            = FALSE
   etc                  = Andersen
   nsttcouple           = 10
   epc                  = Berendsen
   epctype              = Isotropic
   nstpcouple           = 10
   tau_p                = 0.5
   ref_p (3x3):
      ref_p[    0]={ 1.01325e+00,  0.00000e+00,  0.00000e+00}
      ref_p[    1]={ 0.00000e+00,  1.01325e+00,  0.00000e+00}
      ref_p[    2]={ 0.00000e+00,  0.00000e+00,  1.01325e+00}
   compress (3x3):
      compress[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
      compress[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
      compress[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
   refcoord_scaling     = No
   posres_com (3):
      posres_com[0]= 0.00000e+00
      posres_com[1]= 0.00000e+00
      posres_com[2]= 0.00000e+00
   posres_comB (3):
      posres_comB[0]= 0.00000e+00
      posres_comB[1]= 0.00000e+00
      posres_comB[2]= 0.00000e+00
   andersen_seed        = 815131
   rlist                = 1.2
   rlistlong            = 1.2
   rtpi                 = 0.05
   coulombtype          = PME
   rcoulomb_switch      = 0
   rcoulomb             = 1.2
   vdwtype              = Cut-off
   rvdw_switch          = 0
   rvdw                 = 1.2
   epsilon_r            = 1
   epsilon_rf           = 1
   tabext               = 1
   implicit_solvent     = No
   gb_algorithm         = Still
   gb_epsilon_solvent   = 80
   nstgbradii           = 1
   rgbradii             = 1
   gb_saltconc          = 0
   gb_obc_alpha         = 1
   gb_obc_beta          = 0.8
   gb_obc_gamma         = 4.85
   gb_dielectric_offset = 0.009
   sa_algorithm         = No
   sa_surface_tension   = 2.092
   DispCorr             = EnerPres
   free_energy          = no
   init_lambda          = 0
   delta_lambda         = 0
   n_foreign_lambda     = 0
   sc_alpha             = 0
   sc_power             = 0
   sc_sigma             = 0.3
   sc_sigma_min         = 0.3
   nstdhdl              = 10
   separate_dhdl_file   = yes
   dhdl_derivatives     = yes
   dh_hist_size         = 0
   dh_hist_spacing      = 0.1
   nwall                = 0
   wall_type            = 9-3
   wall_atomtype[0]     = -1
   wall_atomtype[1]     = -1
   wall_density[0]      = 0
   wall_density[1]      = 0
   wall_ewald_zfac      = 3
   pull                 = no
   disre                = No
   disre_weighting      = Conservative
   disre_mixed          = FALSE
   dr_fc                = 1000
   dr_tau               = 0
   nstdisreout          = 100
   orires_fc            = 0
   orires_tau           = 0
   nstorireout          = 100
   dihre-fc             = 1000
   em_stepsize          = 0.01
   em_tol               = 10
   niter                = 20
   fc_stepsize          = 0
   nstcgsteep           = 1000
   nbfgscorr            = 10
   ConstAlg             = Lincs
   shake_tol            = 0.0001
   lincs_order          = 8
   lincs_warnangle      = 30
   lincs_iter           = 4
   bd_fric              = 0
   ld_seed              = 1993
   cos_accel            = 0
   deform (3x3):
      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   userint1             = 0
   userint2             = 0
   userint3             = 0
   userint4             = 0
   userreal1            = 0
   userreal2            = 0
   userreal3            = 0
   userreal4            = 0
grpopts:
   nrdf:       99021
   ref_t:      298.15
   tau_t:         0.3
anneal:          No
ann_npoints:           0
   acc:            0           0           0
   nfreeze:           Y           Y           Y           N           N           N
   energygrp_flags[  0]: 0 0
   energygrp_flags[  1]: 0 0
   efield-x:
      n = 0
   efield-xt:
      n = 0
   efield-y:
      n = 0
   efield-yt:
      n = 0
   efield-z:
      n = 0
   efield-zt:
      n = 0
   bQMMM                = FALSE
   QMconstraints        = 0
   QMMMscheme           = 0
   scalefactor          = 1
qm_opts:
   ngQM                 = 0

Initializing Domain Decomposition on 4 nodes
Dynamic load balancing: auto
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
    two-body bonded interactions: 0.585 nm, LJ-14, atoms 10901 11433
  multi-body bonded interactions: 0.482 nm, Ryckaert-Bell., atoms 11431 11935
Minimum cell size due to bonded interactions: 0.530 nm
Maximum distance for 9 constraints, at 120 deg. angles, all-trans: 0.218 nm
Estimated maximum distance required for P-LINCS: 0.218 nm
Using 0 separate PME nodes
Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
Optimizing the DD grid for 4 cells with a minimum initial size of 0.663 nm
The maximum allowed number of cells is: X 9 Y 10 Z 16
Domain decomposition grid 1 x 4 x 1, separate PME nodes 0
PME domain decomposition: 1 x 4 x 1
Domain decomposition nodeid 0, coordinates 0 0 0

Table routines are used for coulomb: TRUE
Table routines are used for vdw:     FALSE
Will do PME sum in reciprocal space.

Will do ordinary reciprocal space Ewald sum.
Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
Cut-off's:   NS: 1.2   Coulomb: 1.2   LJ: 1.2
Long Range LJ corr.: <C6> 4.0351e-04
System total charge: -0.000
Generated table with 1100 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ12.
Tabscale = 500 points/nm

Enabling SPC-like water optimization for 11505 molecules.

Configuring nonbonded kernels...
Configuring standard C nonbonded kernels...
Testing x86_64 SSE2 support... present.


Removing pbc first time

Initializing Parallel LINear Constraint Solver

Linking all bonded interactions to atoms
There are 65716 inter charge-group exclusions,
will use an extra communication step for exclusion forces for PME

The initial number of communication pulses is: Y 1
The initial domain decomposition cell size is: Y 1.77 nm

The maximum allowed distance for charge groups involved in interactions is:
                 non-bonded interactions           1.200 nm
(the following are initial values, they could change due to box deformation)
            two-body bonded interactions  (-rdd)   1.200 nm
          multi-body bonded interactions  (-rdd)   1.200 nm
  atoms separated by up to 9 constraints  (-rcon)  1.773 nm

When dynamic load balancing gets turned on, these settings will change to:
The maximum number of communication pulses is: Y 1
The minimum size for domain decomposition cells is 1.200 nm
The requested allowed shrink of DD cells (option -dds) is: 0.80
The allowed shrink of domain decomposition cells is: Y 0.68
The maximum allowed distance for charge groups involved in interactions is:
                 non-bonded interactions           1.200 nm
            two-body bonded interactions  (-rdd)   1.200 nm
          multi-body bonded interactions  (-rdd)   1.200 nm
  atoms separated by up to 9 constraints  (-rcon)  1.200 nm


Making 1D domain decomposition grid 1 x 4 x 1, home cell index 0 0 0

Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  rest
There are: 46503 Atoms
Charge group distribution at step 0: 4533 7043 7334 4581
Grid: 10 x 6 x 17 cells

Constraining the starting coordinates (step 0)

Constraining the coordinates at t0-dt (step 0)
RMS relative constraint deviation after constraining: 7.96e-07
Initial temperature: 297.745 K

Started mdrun on node 0 Fri Oct  8 14:46:51 2010

           Step           Time         Lambda
              0        0.00000        0.00000

   Energies (kJ/mol)
           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
    9.26629e+03    2.53358e+04    1.36779e+03    2.97600e+04    1.20809e+04
     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
    1.40505e+05    3.83498e+04   -2.30989e+03   -5.95333e+05   -1.96357e+05
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -5.37334e+05    1.22595e+05   -4.14739e+05    2.97810e+02   -1.67546e+02
 Pressure (bar)   Constr. rmsd
    2.67468e+00    1.03652e-06

DD  step 9 load imb.: force 19.9%

At step 10 the performance loss due to force load imbalance is 9.3 %

NOTE: Turning on dynamic load balancing

DD  step 999  vol min/aver 0.777  load imb.: force  0.1%

           Step           Time         Lambda
           1000        1.00000        0.00000

   Energies (kJ/mol)
           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
    9.29054e+03    2.49530e+04    1.43296e+03    2.96188e+04    1.19777e+04
     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
    1.40496e+05    3.99112e+04   -2.30308e+03   -5.96482e+05   -1.96429e+05
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -5.37533e+05    1.22974e+05   -4.14560e+05    2.98729e+02   -1.66560e+02
 Pressure (bar)   Constr. rmsd
   -1.40877e+02    1.04647e-06

DD  step 1999  vol min/aver 0.773  load imb.: force  0.1%

................................................................................

           Step           Time         Lambda
          10000       10.00000        0.00000

Writing checkpoint, step 10000 at Fri Oct  8 14:58:26 2010


   Energies (kJ/mol)
           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
    9.00658e+03    2.52059e+04    1.34920e+03    2.95995e+04    1.19606e+04
     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
    1.40474e+05    4.00471e+04   -2.30290e+03   -5.96601e+05   -1.96374e+05
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -5.37636e+05    1.22577e+05   -4.15059e+05    2.97765e+02   -1.66533e+02
 Pressure (bar)   Constr. rmsd
   -5.69272e+01    1.04191e-06

 <======  ###############  ==>
 <====  A V E R A G E S  ====>
 <==  ###############  ======>

 Statistics over 10001 steps using 1001 frames

   Energies (kJ/mol)
           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
    9.11274e+03    2.49545e+04    1.36688e+03    2.96269e+04    1.20386e+04
     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
    1.40680e+05    3.95513e+04   -2.30457e+03   -5.95701e+05   -1.96403e+05
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -5.37077e+05    1.22248e+05   -4.14829e+05    2.96967e+02   -1.66776e+02
 Pressure (bar)   Constr. rmsd
    4.32167e+00    0.00000e+00

          Box-X          Box-Y          Box-Z
    6.01417e+00    7.09874e+00    1.07493e+01

   Total Virial (kJ/mol)
    4.10242e+04   -9.05005e+00   -2.30129e+02
   -1.20791e+01    4.05914e+04    1.71615e+02
   -2.13770e+02    1.99254e+02    4.04540e+04

   Pressure (bar)
   -1.22617e+01   -8.98547e-01    1.93020e+01
   -6.78561e-01    2.15880e+01   -8.68094e+00
    1.81194e+01   -1.06823e+01    3.63870e+00

   Total Dipole (D)
    4.73145e+02   -1.30311e+03   -2.15240e+02

  Epot (kJ/mol)        Coul-SR          LJ-SR        Coul-14          LJ-14
glu242side-glu242side    2.99268e+00    0.00000e+00   -1.85865e+02    1.35027e+00
glu242side-rest   -5.15085e+01   -2.83484e+01    2.08195e+01    4.24614e+00
      rest-rest   -5.95653e+05    3.95797e+04    1.40846e+05    1.20330e+04


 M E G A - F L O P S   A C C O U N T I N G

   RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
   T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
   NF=No Forces

 Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
 Coul(T)                              10781.556318      452825.365     4.6
 Coul(T) [W3]                            70.655819        8831.977     0.1
 Coul(T) + LJ                         34247.547832     1883615.131    18.9
 Coul(T) + LJ [W3]                     4684.616330      646477.054     6.5
 Coul(T) + LJ [W3-W3]                 12244.355656     4677343.861    47.0
 Outer nonbonded loop                  2334.588434       23345.884     0.2
 1,4 nonbonded interactions             314.111408       28270.027     0.3
 Calc Weights                          1395.229509       50228.262     0.5
 Spread Q Bspline                    100456.524648      200913.049     2.0
 Gather F Bspline                    100456.524648      602739.148     6.1
 3D-FFT                              105882.547196      847060.378     8.5
 Solve PME                             1490.549040       95395.139     1.0
 NS-Pairs                             12131.757207      254766.901     2.6
 Reset In Box                            23.514491          70.543     0.0
 CG-CoM                                  46.596006         139.788     0.0
 Bonds                                   61.886188        3651.285     0.0
 Angles                                 219.991997       36958.655     0.4
 Propers                                 23.872387        5466.777     0.1
 RB-Dihedrals                           253.765374       62680.047     0.6
 Virial                                  46.729683         841.134     0.0
 Stop-CM                                  0.465030           4.650     0.0
 P-Coupling                             465.076503        2790.459     0.0
 Calc-Ekin                              465.123006       12558.321     0.1
 Lincs                                   62.989618        3779.377     0.0
 Lincs-Mat                              569.895960        2279.584     0.0
 Constraint-V                           471.185790        3769.486     0.0
 Constraint-Vir                          40.852892         980.469     0.0
 Settle                                 115.084515       37172.298     0.4
-----------------------------------------------------------------------------
 Total                                                 9944955.052   100.0
-----------------------------------------------------------------------------


    D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S

 av. #atoms communicated per step for force:  2 x 31556.3
 av. #atoms communicated per step for LINCS:  5 x 512.1

 Average load imbalance: 0.5 %
 Part of the total run time spent waiting due to load imbalance: 0.3 %
 Steps where the load balancing was limited by -rdd, -rcon and/or -dds: Y 0 %


     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:         Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
 Domain decomp.         4       1001       56.954       21.4     0.8
 DD comm. load          4       1000        0.206        0.1     0.0
 DD comm. bounds        4       1000        2.836        1.1     0.0
 Comm. coord.           4      10001       30.480       11.5     0.4
 Neighbor search        4       1001      579.978      218.1     7.8
 Force                  4      10001     4548.315     1710.0    61.5
 Wait + Comm. F         4      10001       17.520        6.6     0.2
 PME mesh               4      10001     1897.783      713.5    25.7
 Write traj.            4          3        0.668        0.3     0.0
 Update                 4      10001       45.142       17.0     0.6
 Constraints            4      10001      181.826       68.4     2.5
 Comm. energies         4       1011        3.026        1.1     0.0
 Rest                   4                  31.895       12.0     0.4
-----------------------------------------------------------------------
 Total                  4                7396.630     2780.9   100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
 PME redist. X/F        4      20002      208.454       78.4     2.8
 PME spread/gather      4      20002     1440.827      541.7    19.5
 PME 3D-FFT             4      20002      203.508       76.5     2.8
 PME solve              4      10001       44.697       16.8     0.6
-----------------------------------------------------------------------

 Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:    695.218    695.218    100.0
                       11:35
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    243.800     14.305      1.243     19.310
Finished mdrun on node 0 Fri Oct  8 14:58:27 2010


////////////////////////////////////////////////////////////////////////////////////////////////////
                         :-)  G  R  O  M  A  C  S  (-:

                   Groningen Machine for Chemical Simulation

                   :-)  VERSION 4.5.1-dev-20101006-d3b58  (-:

        Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
      Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
        Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff,
          Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schultz,
                Michael Shirts, Alfons Sijbers, Peter Tieleman,

               Berk Hess, David van der Spoel, and Erik Lindahl.

       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
            Copyright (c) 2001-2010, The GROMACS development team at
        Uppsala University & The Royal Institute of Technology, Sweden.
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
          modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
             of the License, or (at your option) any later version.

 :-)  /home/leontyev/programs/bin/gromacs/gromacs-4.5.1-gpu/bin/mdrun-gpu  (-:

Input Parameters:
   integrator           = md
   nsteps               = 10000
   init_step            = 0
   ns_type              = Grid
   nstlist              = 10
   ndelta               = 2
   nstcomm              = 1003
   comm_mode            = Linear
   nstlog               = 1000
   nstxout              = 5000
   nstvout              = 10000000
   nstfout              = 0
   nstcalcenergy        = 10
   nstenergy            = 1000
   nstxtcout            = 0
   init_t               = 0
   delta_t              = 0.001
   xtcprec              = 1000
   nkx                  = 54
   nky                  = 60
   nkz                  = 90
   pme_order            = 6
   ewald_rtol           = 1e-05
   ewald_geometry       = 0
   epsilon_surface      = 0
   optimize_fft         = TRUE
   ePBC                 = xyz
   bPeriodicMols        = FALSE
   bContinuation        = FALSE
   bShakeSOR            = FALSE
   etc                  = Andersen
   nsttcouple           = 10
   epc                  = Berendsen
   epctype              = Isotropic
   nstpcouple           = 10
   tau_p                = 0.5
   ref_p (3x3):
      ref_p[    0]={ 1.01325e+00,  0.00000e+00,  0.00000e+00}
      ref_p[    1]={ 0.00000e+00,  1.01325e+00,  0.00000e+00}
      ref_p[    2]={ 0.00000e+00,  0.00000e+00,  1.01325e+00}
   compress (3x3):
      compress[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
      compress[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
      compress[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
   refcoord_scaling     = No
   posres_com (3):
      posres_com[0]= 0.00000e+00
      posres_com[1]= 0.00000e+00
      posres_com[2]= 0.00000e+00
   posres_comB (3):
      posres_comB[0]= 0.00000e+00
      posres_comB[1]= 0.00000e+00
      posres_comB[2]= 0.00000e+00
   andersen_seed        = 815131
   rlist                = 1.2
   rlistlong            = 1.2
   rtpi                 = 0.05
   coulombtype          = PME
   rcoulomb_switch      = 0
   rcoulomb             = 1.2
   vdwtype              = Cut-off
   rvdw_switch          = 0
   rvdw                 = 1.2
   epsilon_r            = 1
   epsilon_rf           = 1
   tabext               = 1
   implicit_solvent     = No
   gb_algorithm         = Still
   gb_epsilon_solvent   = 80
   nstgbradii           = 1
   rgbradii             = 1
   gb_saltconc          = 0
   gb_obc_alpha         = 1
   gb_obc_beta          = 0.8
   gb_obc_gamma         = 4.85
   gb_dielectric_offset = 0.009
   sa_algorithm         = Ace-approximation
   sa_surface_tension   = 2.092
   DispCorr             = EnerPres
   free_energy          = no
   init_lambda          = 0
   delta_lambda         = 0
   n_foreign_lambda     = 0
   sc_alpha             = 0
   sc_power             = 0
   sc_sigma             = 0.3
   sc_sigma_min         = 0.3
   nstdhdl              = 10
   separate_dhdl_file   = yes
   dhdl_derivatives     = yes
   dh_hist_size         = 0
   dh_hist_spacing      = 0.1
   nwall                = 0
   wall_type            = 9-3
   wall_atomtype[0]     = -1
   wall_atomtype[1]     = -1
   wall_density[0]      = 0
   wall_density[1]      = 0
   wall_ewald_zfac      = 3
   pull                 = no
   disre                = No
   disre_weighting      = Conservative
   disre_mixed          = FALSE
   dr_fc                = 1000
   dr_tau               = 0
   nstdisreout          = 100
   orires_fc            = 0
   orires_tau           = 0
   nstorireout          = 100
   dihre-fc             = 1000
   em_stepsize          = 0.01
   em_tol               = 10
   niter                = 20
   fc_stepsize          = 0
   nstcgsteep           = 1000
   nbfgscorr            = 10
   ConstAlg             = Lincs
   shake_tol            = 0.0001
   lincs_order          = 8
   lincs_warnangle      = 30
   lincs_iter           = 4
   bd_fric              = 0
   ld_seed              = 1993
   cos_accel            = 0
   deform (3x3):
      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   userint1             = 0
   userint2             = 0
   userint3             = 0
   userint4             = 0
   userreal1            = 0
   userreal2            = 0
   userreal3            = 0
   userreal4            = 0
grpopts:
   nrdf:       99021
   ref_t:      298.15
   tau_t:         0.3
anneal:          No
ann_npoints:           0
   acc:            0           0           0
   nfreeze:           Y           Y           Y           N           N           N
   energygrp_flags[  0]: 0 0
   energygrp_flags[  1]: 0 0
   efield-x:
      n = 0
   efield-xt:
      n = 0
   efield-y:
      n = 0
   efield-yt:
      n = 0
   efield-z:
      n = 0
   efield-zt:
      n = 0
   bQMMM                = FALSE
   QMconstraints        = 0
   QMMMscheme           = 0
   scalefactor          = 1
qm_opts:
   ngQM                 = 0
Table routines are used for coulomb: TRUE
Table routines are used for vdw:     FALSE
Will do PME sum in reciprocal space.

Will do ordinary reciprocal space Ewald sum.
Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
Cut-off's:   NS: 1.2   Coulomb: 1.2   LJ: 1.2
Long Range LJ corr.: <C6> 4.0351e-04
System total charge: -0.000
Generated table with 1100 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ12.
Tabscale = 500 points/nm

Enabling SPC-like water optimization for 11505 molecules.

Configuring nonbonded kernels...
Configuring standard C nonbonded kernels...


Removing pbc first time

Initializing LINear Constraint Solver

Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  rest
Max number of connections per atom is 91
Total number of connections is 387700
Max number of graph edges per atom is 6
Total number of graph edges is 70330

OpenMM plugins loaded from directory /home/leontyev/programs/bin/gromacs/OpenMM2.0-Linux64/lib/plugins: libOpenMMCuda.so, libOpenMMOpenCL.so,
The combination rule of the used force field matches the one used by OpenMM.
Gromacs will use the OpenMM platform: Cuda
Gromacs will run on the GPU #0 (GeForce GTX 260).
Pre-simulation ~15s memtest in progress...
Memory test completed without errors.

Constraining the starting coordinates (step 0)

Constraining the coordinates at t0-dt (step 0)
Initial temperature: 0 K

Started mdrun on node 0 Fri Oct  8 16:54:04 2010

           Step           Time         Lambda
              0        0.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.34934e+05    1.22629e+05   -4.12305e+05    2.97883e+02    1.03777e-06

           Step           Time         Lambda
           1000        1.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.42963e+05    1.16609e+05   -4.26354e+05    2.83260e+02    1.03777e-06

           Step           Time         Lambda
           2000        2.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.49782e+05    1.14408e+05   -4.35374e+05    2.77912e+02    1.03777e-06

           Step           Time         Lambda
           3000        3.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.51337e+05    1.12705e+05   -4.38631e+05    2.73777e+02    1.03777e-06

           Step           Time         Lambda
           4000        4.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.52340e+05    1.12827e+05   -4.39513e+05    2.74073e+02    1.03777e-06

           Step           Time         Lambda
           5000        5.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.52599e+05    1.13543e+05   -4.39056e+05    2.75812e+02    1.03777e-06

           Step           Time         Lambda
           6000        6.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.52946e+05    1.14271e+05   -4.38675e+05    2.77580e+02    1.03777e-06

           Step           Time         Lambda
           7000        7.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.51992e+05    1.13521e+05   -4.38471e+05    2.75759e+02    1.03777e-06

           Step           Time         Lambda
           8000        8.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.52834e+05    1.14111e+05   -4.38723e+05    2.77192e+02    1.03777e-06

           Step           Time         Lambda
           9000        9.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.52806e+05    1.13783e+05   -4.39022e+05    2.76396e+02    1.03777e-06

           Step           Time         Lambda
          10000       10.00000        0.00000

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.53230e+05    1.12594e+05   -4.40636e+05    2.73506e+02    1.03777e-06

Writing checkpoint, step 10000 at Fri Oct  8 17:06:02 2010


 <======  ###############  ==>
 <====  A V E R A G E S  ====>
 <==  ###############  ======>

 Statistics over 11 steps using 11 frames

   Energies (kJ/mol)
      Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
   -5.49797e+05    1.14636e+05   -4.35160e+05    2.78468e+02    0.00000e+00

          Box-X          Box-Y          Box-Z
    1.73572e+12    1.19301e-40    2.31720e+11

   Total Virial (kJ/mol)
    0.00000e+00    0.00000e+00    0.00000e+00
    0.00000e+00    0.00000e+00    0.00000e+00
    0.00000e+00    0.00000e+00    0.00000e+00

   Pressure (bar)
    0.00000e+00    0.00000e+00    0.00000e+00
    0.00000e+00    0.00000e+00    0.00000e+00
    0.00000e+00    0.00000e+00    0.00000e+00

   Total Dipole (D)
    0.00000e+00    0.00000e+00    0.00000e+00

  Epot (kJ/mol)        Coul-SR          LJ-SR        Coul-14          LJ-14
glu242side-glu242side    0.00000e+00    0.00000e+00    0.00000e+00    0.00000e+00
glu242side-rest    0.00000e+00    0.00000e+00    0.00000e+00    0.00000e+00
      rest-rest    0.00000e+00    0.00000e+00    0.00000e+00    0.00000e+00

Post-simulation ~15s memtest in progress...
Memory test completed without errors.

 M E G A - F L O P S   A C C O U N T I N G

   RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
   T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
   NF=No Forces

 Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
 Lincs                                    0.011934           0.716     8.0
 Lincs-Mat                                0.106200           0.425     4.7
 Constraint-V                             0.046449           0.372     4.2
 Settle                                   0.023010           7.432    83.1
-----------------------------------------------------------------------------
 Total                                                       8.945   100.0
-----------------------------------------------------------------------------


     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:         Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
 Write traj.            1         11        3.033        1.1     0.2
 Rest                   1                1978.521      716.9    99.8
-----------------------------------------------------------------------
 Total                  1                1981.554      718.0   100.0
-----------------------------------------------------------------------

 OpenMM run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:    717.970    717.970    100.0
                       11:57
               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.000      0.012      1.204     19.942
Finished mdrun on node 0 Fri Oct  8 17:06:02 2010
////////////////////////////////////////////////////////////////////////////////////////////////////
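As a sanity check on the performance line in the log above, here is a minimal Python sketch that recomputes ns/day and hour/ns from the wall-clock time. The 1 fs time step is read off the log (step 10000 corresponds to 10.00000 ps); the function name and the idea of trying both 10000 and 10001 steps are my own assumptions, not something taken from the GROMACS source.

```python
# Values copied from the log above.
WALL_SECONDS = 717.970   # "Time:" line, Real (s)
DT_PS = 0.001            # inferred: step 10000 <-> 10.00000 ps
LAST_STEP = 10000        # last step number printed in the log


def performance(wall_s, n_steps, dt_ps):
    """Return (ns/day, hour/ns) for n_steps of dt_ps picoseconds done in wall_s seconds."""
    simulated_ns = n_steps * dt_ps / 1000.0
    ns_per_day = simulated_ns * 86400.0 / wall_s
    hours_per_ns = (wall_s / 3600.0) / simulated_ns
    return ns_per_day, hours_per_ns


# Try both step counts: 10000 (steps 1..10000) and 10001 (steps 0..10000).
for n in (LAST_STEP, LAST_STEP + 1):
    ns_day, h_ns = performance(WALL_SECONDS, n, DT_PS)
    print("%6d steps: %.3f ns/day, %.3f hour/ns" % (n, ns_day, h_ns))
```

Counting 10001 steps reproduces the logged 1.204 ns/day and 19.942 hour/ns, so the timing figures at least look internally consistent; the zeroed virial, pressure and energy-group entries are a separate matter.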

> Igor Leontyev wrote:
>
> Finally, I compiled and ran simulations with gpu version of gromacs-4.5.1.
> There were several issues:
>
> 1) Precompiled OpenMM2.0 libraries and headers must be downloaded (which
> requires registration on their web page) and installed, otherwise cmake
> doesn't find some source files.
>
> 2) cmake should be called outside the original source directory with the
> path of the directory as an argument.
>
> 3) To run the obtained mdrun-gpu binary the CUDA dev driver should be
> installed, otherwise the program does not find 'CUDA'. This step appeared
> to be the most problematic for me. According to the OpenMM manual, the
> driver must be installed with the X Window service turned off, which can
> be done with the command "init 3". In Ubuntu this command has no effect;
> instead, the graphical interface is switched off/on by
>
> "sudo service gdm stop/start"
>
> It turned out that in Ubuntu-10.04 the CUDA driver installation script
> does not work properly even with gdm turned off. This issue and its
> solution are described at http://ubuntuforums.org/showthread.php?t=1467074
>
> Thank you for comments,
>
> Igor
>>
>>
>> Szilárd Páll wrote:
>> Dear Igor,
>>
>> Your output looks _very_ weird, it seems as if CMake internal
>> variable(s) were not initialized, and I have no clue how that could have
>> happened - the build generator works just fine for me. The only thing
>> I can think of is that maybe your CMakeCache is corrupted.
>>
>> Could you please rerun cmake in a _clean_ build directory? Also, are
>> you able to run cmake for CPU build (no -D options)?
>>
>> --
>> Szilárd
>>
>>> Szilárd wrote:
>>>>
>>>> The beta versions are all outdated, could you please use the latest
>>>> source distribution (4.5.1) instead (or git from the
>>>> release-4-5-patches branch)?
>>>
>>> The result is the same for both the distribution 4.5.1 and git from the
>>> release-4-5-patches. See the output below.
>>> =========================================
>>>
>>> PATH=/usr/local/opt/bin/mpi/openmpi-1.4.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
>>> LD_LIBRARY_PATH=/usr/local/opt/bin/mpi/openmpi-1.4.2/lib:/home/leontyev/programs/bin/cuda/lib64:
>>> CPPFLAGS=-I//usr/local/opt/bin/gromacs/fftw-3.2.2/single_sse/include
>>> -I//usr/local/opt/bin/mpi/openmpi-1.4.2/include
>>> LDFLAGS=-L//usr/local/opt/bin/gromacs/fftw-3.2.2/single_sse/lib
>>> -L//usr/local/opt/bin/mpi/openmpi-1.4.2/lib
>>> OPENMM_ROOT_DIR=/home/leontyev/programs/bin/gromacs/gromacs-4.5.1-git/openmm
>>>
>>> cmake src -DGMX_OPENMM=ON -DGMX_THREADS=OFF
>>> -DCMAKE_INSTALL_PREFIX=/home/leontyev/programs/bin/gromacs/gromacs-4.5.1-git
>>> CMake Error at gmxlib/CMakeLists.txt:124 (set_target_properties):
>>> set_target_properties called with incorrect number of arguments.
>>>
>>>
>>> CMake Error at gmxlib/CMakeLists.txt:126 (install):
>>> install TARGETS given no ARCHIVE DESTINATION for static library target
>>> "gmx".
>>>
>>>
>>> CMake Error at mdlib/CMakeLists.txt:11 (set_target_properties):
>>> set_target_properties called with incorrect number of arguments.
>>>
>>>
>>> CMake Error at mdlib/CMakeLists.txt:13 (install):
>>> install TARGETS given no ARCHIVE DESTINATION for static library target
>>> "md".
>>>
>>>
>>> CMake Error at kernel/CMakeLists.txt:43 (set_target_properties):
>>> set_target_properties called with incorrect number of arguments.
>>>
>>>
>>> CMake Error at kernel/CMakeLists.txt:44 (set_target_properties):
>>> set_target_properties called with incorrect number of arguments.
>>>
>>>
>>> CMake Error at kernel/gmx_gpu_utils/CMakeLists.txt:18
>>> (CUDA_INCLUDE_DIRECTORIES):
>>> Unknown CMake command "CUDA_INCLUDE_DIRECTORIES".
>>>
>>>
>>> CMake Warning (dev) in CMakeLists.txt:
>>> No cmake_minimum_required command is present. A line of code such as
>>>
>>> cmake_minimum_required(VERSION 2.8)
>>>
>>> should be added at the top of the file. The version specified may be lower
>>> if you wish to support older CMake versions for this project. For more
>>> information run "cmake --help-policy CMP0000".
>>> This warning is for project developers. Use -Wno-dev to suppress it.
>>>
>>> -- Configuring incomplete, errors occurred! 
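For reference, the clean out-of-source configure discussed in this thread can be sketched in Python as below. This is only an illustration under stated assumptions: the helper name configure_command and the throw-away directory names are hypothetical, while the -D options are the ones quoted in the thread. The quoted errors would be consistent with cmake having been pointed at the src subdirectory rather than the top of the source tree (the missing cmake_minimum_required warning hints at that), but that is a guess from the output, not a confirmed diagnosis.

```python
import os
import tempfile


def configure_command(source_root, build_dir, install_prefix):
    """Create an empty build directory and return the cmake command to run inside it.

    Hypothetical helper: it refuses a source_root without a top-level
    CMakeLists.txt and a build_dir that already contains files (for
    example a stale CMakeCache.txt).
    """
    if not os.path.isfile(os.path.join(source_root, "CMakeLists.txt")):
        raise FileNotFoundError("no CMakeLists.txt in " + source_root)
    if os.path.isdir(build_dir) and os.listdir(build_dir):
        raise RuntimeError("build directory is not empty: " + build_dir)
    os.makedirs(build_dir, exist_ok=True)
    return [
        "cmake",
        os.path.abspath(source_root),   # top of the source tree, not its src/ subdirectory
        "-DGMX_OPENMM=ON",              # options as quoted in the thread
        "-DGMX_THREADS=OFF",
        "-DCMAKE_INSTALL_PREFIX=" + install_prefix,
    ]


if __name__ == "__main__":
    # Demonstration on throw-away directories; nothing is actually configured.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "gromacs-4.5.1")
        os.makedirs(src)
        open(os.path.join(src, "CMakeLists.txt"), "w").close()
        cmd = configure_command(src, os.path.join(tmp, "build-gpu"),
                                os.path.join(tmp, "install"))
        print(" ".join(cmd))
```

To actually configure, the returned list could be passed to subprocess.check_call with cwd set to the build directory, with OPENMM_ROOT_DIR exported beforehand as in the environment listing above.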




