[gmx-users] segmentation fault from power6 kernel

Mark Abraham Mark.Abraham at anu.edu.au
Thu Nov 3 02:48:19 CET 2011


On 3/11/2011 6:42 AM, Fabio AFFINITO wrote:
> Dear all,
> I'm trying to run a simulation on an IBM Power6 cluster. At the
> beginning of the simulation I get a segmentation fault. I investigated with TotalView and found that the segmentation violation originates in pwr6kernel310.F.
> So far I have not found what causes this segmentation violation. Is anybody aware of a bug in this function?
> The simulation is run with Gromacs 4.5.3 compiled in double precision.
> The options I passed to configure are:
> --disable-threads --enable-power6 --enable-mpi
>
> The log file doesn't provide much information:
>
> Log file opened on Wed Nov  2 20:11:02 2011
> Host: sp0202  pid: 11796682  nodeid: 0  nnodes:  1
> The Gromacs distribution was built Thu Dec 16 14:44:40 GMT+01:00 2010 by
> propro01 at sp0201 (AIX 1 00C3E6444C00)
>
>
>                           :-)  G  R  O  M  A  C  S  (-:
>
>                 Gromacs Runs One Microsecond At Cannonball Speeds
>
>                              :-)  VERSION 4.5.3  (-:
>
>          Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
>        Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
>          Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff,
>             Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schulz,
>                  Michael Shirts, Alfons Sijbers, Peter Tieleman,
>
>                 Berk Hess, David van der Spoel, and Erik Lindahl.
>
>         Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>              Copyright (c) 2001-2010, The GROMACS development team at
>          Uppsala University & The Royal Institute of Technology, Sweden.
>              check out http://www.gromacs.org for more information.
>
>           This program is free software; you can redistribute it and/or
>            modify it under the terms of the GNU General Public License
>           as published by the Free Software Foundation; either version 2
>               of the License, or (at your option) any later version.
>
>                        :-)  mdrun_d (double precision)  (-:
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
> GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
> molecular simulation
> J. Chem. Theory Comput. 4 (2008) pp. 435-447
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
> Berendsen
> GROMACS: Fast, Flexible and Free
> J. Comp. Chem. 26 (2005) pp. 1701-1719
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> E. Lindahl and B. Hess and D. van der Spoel
> GROMACS 3.0: A package for molecular simulation and trajectory analysis
> J. Mol. Mod. 7 (2001) pp. 306-317
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> H. J. C. Berendsen, D. van der Spoel and R. van Drunen
> GROMACS: A message-passing parallel molecular dynamics implementation
> Comp. Phys. Comm. 91 (1995) pp. 43-56
> -------- -------- --- Thank You --- -------- --------
>
> Input Parameters:
>     integrator           = md
>     nsteps               = 2500000
>     init_step            = 0
>     ns_type              = Grid
>     nstlist              = 10
>     ndelta               = 2
>     nstcomm              = 10
>     comm_mode            = Linear
>     nstlog               = 2500
>     nstxout              = 2500
>     nstvout              = 2500
>     nstfout              = 0
>     nstcalcenergy        = 10
>     nstenergy            = 2500
>     nstxtcout            = 2500
>     init_t               = 0
>     delta_t              = 0.002
>     xtcprec              = 1000
>     nkx                  = 50
>     nky                  = 50
>     nkz                  = 50
>     pme_order            = 4
>     ewald_rtol           = 1e-05
>     ewald_geometry       = 0
>     epsilon_surface      = 0
>     optimize_fft         = TRUE
>     ePBC                 = xyz
>     bPeriodicMols        = FALSE
>     bContinuation        = FALSE
>     bShakeSOR            = FALSE
>     etc                  = Nose-Hoover
>     nsttcouple           = 10
>     epc                  = No
>     epctype              = Isotropic
>     nstpcouple           = -1
>     tau_p                = 1
>     ref_p (3x3):
>        ref_p[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>        ref_p[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>        ref_p[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>     compress (3x3):
>        compress[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>        compress[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>        compress[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>     refcoord_scaling     = No
>     posres_com (3):
>        posres_com[0]= 0.00000e+00
>        posres_com[1]= 0.00000e+00
>        posres_com[2]= 0.00000e+00
>     posres_comB (3):
>        posres_comB[0]= 0.00000e+00
>        posres_comB[1]= 0.00000e+00
>        posres_comB[2]= 0.00000e+00
>     andersen_seed        = 815131
>     rlist                = 0.9
>     rlistlong            = 0.9
>     rtpi                 = 0.05
>     coulombtype          = PME
>     rcoulomb_switch      = 0
>     rcoulomb             = 0.9
>     vdwtype              = Cut-off
>     rvdw_switch          = 0
>     rvdw                 = 0.9
>     epsilon_r            = 1
>     epsilon_rf           = 1
>     tabext               = 1
>     implicit_solvent     = No
>     gb_algorithm         = Still
>     gb_epsilon_solvent   = 80
>     nstgbradii           = 1
>     rgbradii             = 1
>     gb_saltconc          = 0
>     gb_obc_alpha         = 1
>     gb_obc_beta          = 0.8
>     gb_obc_gamma         = 4.85
>     gb_dielectric_offset = 0.009
>     sa_algorithm         = Ace-approximation
>     sa_surface_tension   = 2.05016
>     DispCorr             = No
>     free_energy          = no
>     init_lambda          = 0
>     delta_lambda         = 0
>     n_foreign_lambda     = 0
>     sc_alpha             = 0
>     sc_power             = 0
>     sc_sigma             = 0.3
>     sc_sigma_min         = 0.3
>     nstdhdl              = 10
>     separate_dhdl_file   = yes
>     dhdl_derivatives     = yes
>     dh_hist_size         = 0
>     dh_hist_spacing      = 0.1
>     nwall                = 0
>     wall_type            = 9-3
>     wall_atomtype[0]     = -1
>     wall_atomtype[1]     = -1
>     wall_density[0]      = 0
>     wall_density[1]      = 0
>     wall_ewald_zfac      = 3
>     pull                 = no
>     disre                = No
>     disre_weighting      = Conservative
>     disre_mixed          = FALSE
>     dr_fc                = 1000
>     dr_tau               = 0
>     nstdisreout          = 100
>     orires_fc            = 0
>     orires_tau           = 0
>     nstorireout          = 100
>     dihre-fc             = 1000
>     em_stepsize          = 0.01
>     em_tol               = 10
>     niter                = 20
>     fc_stepsize          = 0
>     nstcgsteep           = 1000
>     nbfgscorr            = 10
>     ConstAlg             = Lincs
>     shake_tol            = 0.0001
>     lincs_order          = 4
>     lincs_warnangle      = 30
>     lincs_iter           = 1
>     bd_fric              = 0
>     ld_seed              = 1993
>     cos_accel            = 0
>     deform (3x3):
>        deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>        deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>        deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>     userint1             = 0
>     userint2             = 0
>     userint3             = 0
>     userint4             = 0
>     userreal1            = 0
>     userreal2            = 0
>     userreal3            = 0
>     userreal4            = 0
> grpopts:
>     nrdf:       38427
>     ref_t:         350
>     tau_t:           1
> anneal:          No
> ann_npoints:           0
>     acc:	           0           0           0
>     nfreeze:           N           N           N
>     energygrp_flags[  0]: 0
>     efield-x:
>        n = 0
>     efield-xt:
>        n = 0
>     efield-y:
>        n = 0
>     efield-yt:
>        n = 0
>     efield-z:
>        n = 0
>     efield-zt:
>        n = 0
>     bQMMM                = FALSE
>     QMconstraints        = 0
>     QMMMscheme           = 0
>     scalefactor          = 1
> qm_opts:
>     ngQM                 = 0
> Table routines are used for coulomb: TRUE
> Table routines are used for vdw:     FALSE
> Will do PME sum in reciprocal space.
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> U. Essman, L. Perela, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
> A smooth particle mesh Ewald method
> J. Chem. Phys. 103 (1995) pp. 8577-8592
> -------- -------- --- Thank You --- -------- --------
>
> Will do ordinary reciprocal space Ewald sum.
> Using a Gaussian width (1/beta) of 0.288146 nm for Ewald
> Cut-off's:   NS: 0.9   Coulomb: 0.9   LJ: 0.9
> System total charge: 0.000
> Generated table with 3800 data points for Ewald.
> Tabscale = 2000 points/nm
> Generated table with 3800 data points for LJ6.
> Tabscale = 2000 points/nm
> Generated table with 3800 data points for LJ12.
> Tabscale = 2000 points/nm
> Generated table with 3800 data points for 1-4 COUL.
> Tabscale = 2000 points/nm
> Generated table with 3800 data points for 1-4 LJ6.
> Tabscale = 2000 points/nm
> Generated table with 3800 data points for 1-4 LJ12.
> Tabscale = 2000 points/nm
>
> Enabling SPC-like water optimization for 5856 molecules.
>
> Configuring nonbonded kernels...
> Configuring standard C nonbonded kernels...
> Configuring double precision Fortran kernels...
> Configuring double precision IBM Power6-specific Fortran kernels...
>
>
> Removing pbc first time
>
> Initializing LINear Constraint Solver
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
> LINCS: A Linear Constraint Solver for molecular simulations
> J. Comp. Chem. 18 (1997) pp. 1463-1472
> -------- -------- --- Thank You --- -------- --------
>
> The number of constraints is 1620
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> S. Miyamoto and P. A. Kollman
> SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
> Water Models
> J. Comp. Chem. 13 (1992) pp. 952-962
> -------- -------- --- Thank You --- -------- --------
>
>
> And the standard output ends with:
>
> starting mdrun 'Protein in water'
> 2500000 steps,   5000.0 ps.
> ERROR: 0031-250  task 0: Segmentation fault
>
> Do you have any idea, or do you know whether a similar bug has been reported?
>

The most likely cause is an ordinary "blowing up" scenario, in which an
unstable system leads to a table-lookup overrun and hence a segfault in
the 3xx series kernels. I don't know why the usual error messages for
such scenarios did not appear on this platform. Try setting the
environment variable GMX_NOOPTIMIZEDKERNELS to 1: if the segfault goes
away, this is a power6-specific kernel issue. Also try running the same
.tpr on another platform, to see whether the problem follows the input
rather than the machine.
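
As a rough sketch of the first test, the small Python wrapper below sets
GMX_NOOPTIMIZEDKERNELS to 1 for a single mdrun invocation and reports the
exit code. The binary name mdrun_d, the input name topol.tpr and the helper
function names are assumptions for illustration only; on an MPI cluster the
command would normally be wrapped by the site's own parallel launcher.

```python
import os
import subprocess
import sys


def env_without_optimized_kernels(base_env=None):
    """Return a copy of the environment with GMX_NOOPTIMIZEDKERNELS set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env["GMX_NOOPTIMIZEDKERNELS"] = "1"
    return env


def mdrun_command(tpr_file, binary="mdrun_d"):
    """Build a minimal mdrun command line (binary name is an assumption)."""
    return [binary, "-s", tpr_file]


if __name__ == "__main__":
    # Hypothetical input file name; pass the real .tpr as the first argument.
    tpr = sys.argv[1] if len(sys.argv) > 1 else "topol.tpr"
    result = subprocess.run(mdrun_command(tpr),
                            env=env_without_optimized_kernels())
    # When mdrun is started directly like this on a POSIX system, a negative
    # code means it was killed by a signal.
    print("mdrun exit code:", result.returncode)
```

If the run gets past the first steps with the variable set but segfaults
without it, that points at the platform-specific kernels rather than at the
input.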

Mark


