[gmx-users] Re: Can't allocate memory problem

Szilárd Páll pall.szilard at gmail.com
Fri Jul 18 20:57:17 CEST 2014


On Fri, Jul 18, 2014 at 8:25 PM, Yunlong Liu <yliu120 at jhmi.edu> wrote:
> Hi Mark,
>
> I am posting my log file for the run here. Thank you.
>
> Log file opened on Wed Jul 16 11:26:51 2014
> Host: c442-403.stampede.tacc.utexas.edu  pid: 31032  nodeid: 0  nnodes:  4
> GROMACS:    mdrun_mpi_gpu, VERSION 5.0-rc1
>
> GROMACS is written by:
> Emile Apol         Rossen Apostolov   Herman J.C. Berendsen Par Bjelkmar
> Aldert van Buuren  Rudi van Drunen    Anton Feenstra     Sebastian Fritsch
> Gerrit Groenhof    Christoph Junghans Peter Kasson       Carsten Kutzner
> Per Larsson        Justin A. Lemkul   Magnus Lundborg    Pieter Meulenhoff
> Erik Marklund      Teemu Murtola      Szilard Pall       Sander Pronk
> Roland Schulz      Alexey Shvetsov    Michael Shirts     Alfons Sijbers
> Peter Tieleman     Christian Wennberg Maarten Wolf
> and the project leaders:
> Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
>
> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> Copyright (c) 2001-2014, The GROMACS development team at
> Uppsala University, Stockholm University and
> the Royal Institute of Technology, Sweden.
> check out http://www.gromacs.org for more information.
>
> GROMACS is free software; you can redistribute it and/or modify it
> under the terms of the GNU Lesser General Public License
> as published by the Free Software Foundation; either version 2.1
> of the License, or (at your option) any later version.
>
> GROMACS:      mdrun_mpi_gpu, VERSION 5.0-rc1
> Executable:   /work/03002/yliu120/gromacs-5.0/mv2_mkl/bin/mdrun_mpi_gpu
> Library dir:  /work/03002/yliu120/gromacs-5.0/mv2_mkl/share/gromacs/top
> Command line:
>   mdrun_mpi_gpu -pin on -ntomp 8 -deffnm pi3k-wt-1 -gpu_id 00
>
> Gromacs version:    VERSION 5.0-rc1
> Precision:          single
> Memory model:       64 bit
> MPI library:        MPI
> OpenMP support:     enabled
> GPU support:        enabled
> invsqrt routine:    gmx_software_invsqrt(x)
> SIMD instructions:  AVX_256
> FFT library:        Intel MKL
> RDTSCP usage:       enabled
> C++11 compilation:  disabled
> TNG support:        enabled
> Tracing support:    disabled
> Built on:           Wed Jun  4 13:59:17 CDT 2014
> Built by:           xzhu216 at login1.stampede.tacc.utexas.edu [CMAKE]
> Build OS/arch:      Linux 2.6.32-358.18.1.el6.x86_64 x86_64
> Build CPU vendor:   GenuineIntel
> Build CPU brand:    Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
> Build CPU family:   6   Model: 45   Stepping: 7
> Build CPU features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
> C compiler:         /opt/apps/intel/13/composer_xe_2013.2.146/bin/intel64/icc Intel 13.1.0.20130121
> C compiler flags:    -mavx  -fno-strict-aliasing  -mkl=sequential -std=gnu99 -w3 -wd111 -wd177 -wd181 -wd193 -wd271 -wd304 -wd383 -wd424 -wd444 -wd522 -wd593 -wd869 -wd981 -wd1418 -wd1419 -wd1572 -wd1599 -wd2259 -wd2415 -wd2547 -wd2557 -wd3280 -wd3346   -ip -funroll-all-loops -alias-const -ansi-alias   -O3 -DNDEBUG
> C++ compiler:       /opt/apps/intel/13/composer_xe_2013.2.146/bin/intel64/icpc Intel 13.1.0.20130121
> C++ compiler flags:  -mavx  -fno-strict-aliasing  -w3 -wd111 -wd177 -wd181 -wd193 -wd271 -wd304 -wd383 -wd424 -wd444 -wd522 -wd593 -wd869 -wd981 -wd1418 -wd1419 -wd1572 -wd1599 -wd2259 -wd2415 -wd2547 -wd2557 -wd3280 -wd3346 -wd1782   -ip -funroll-all-loops -alias-const -ansi-alias   -O3 -DNDEBUG
> Boost version:      1.51.0 (external)
> CUDA compiler:      /opt/apps/cuda/5.0/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2012 NVIDIA Corporation;Built on Fri_Sep_21_17:28:58_PDT_2012;Cuda compilation tools, release 5.0, V0.2.1221
> CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_20,code=sm_21;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_35,code=compute_35;-use_fast_math;-ccbin=/opt/apps/intel/13/composer_xe_2013.2.146/bin/intel64/icc;;-Xcompiler;-gcc-version=450;; ;-mavx;-fno-strict-aliasing;-w3;-wd111;-wd177;-wd181;-wd193;-wd271;-wd304;-wd383;-wd424;-wd444;-wd522;-wd593;-wd869;-wd981;-wd1418;-wd1419;-wd1572;-wd1599;-wd2259;-wd2415;-wd2547;-wd2557;-wd3280;-wd3346;-wd1782;-ip;-funroll-all-loops;-alias-const;-ansi-alias;-O3;-DNDEBUG
> CUDA driver:        5.50
> CUDA runtime:       5.0

Tip: you will get better performance if you use CUDA 5.5, gcc 4.8, and
FFTW. The difference in total performance depends on your setup and could
be anywhere between 0 and 15%.
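
If it helps, a rough configure line for such a build could look like this (an
untested sketch; the compiler/CUDA paths and the choice to build FFTW
internally are my assumptions, not taken from your log):

    # after loading gcc 4.8 and CUDA 5.5 (e.g. via your site's module system)
    cmake .. \
      -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ \
      -DGMX_MPI=ON -DGMX_GPU=ON \
      -DCUDA_TOOLKIT_ROOT_DIR=/path/to/cuda-5.5 \
      -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=ON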

>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
> GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
> molecular simulation
> J. Chem. Theory Comput. 4 (2008) pp. 435-447
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
> Berendsen
> GROMACS: Fast, Flexible and Free
> J. Comp. Chem. 26 (2005) pp. 1701-1719
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> E. Lindahl and B. Hess and D. van der Spoel
> GROMACS 3.0: A package for molecular simulation and trajectory analysis
> J. Mol. Mod. 7 (2001) pp. 306-317
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> H. J. C. Berendsen, D. van der Spoel and R. van Drunen
> GROMACS: A message-passing parallel molecular dynamics implementation
> Comp. Phys. Comm. 91 (1995) pp. 43-56
> -------- -------- --- Thank You --- -------- --------
>
>
> Number of CPUs detected (16) does not match the number reported by OpenMP (1).
> Consider setting the launch configuration manually!

Something is still not right here. This message means that the OpenMP
library reports that there is *one* core available (through
omp_get_num_procs). Please consult your job scheduler's documentation
because this could affect performance (my guess is that it doesn't).
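
If you want to check what the job environment actually exposes, you could
print it from the batch script right before launching mdrun (a hedged sketch;
the ibrun launcher and exact script layout are assumptions about your TACC
setup):

    echo "cores visible to the OS: $(nproc)"
    echo "OMP_NUM_THREADS:         ${OMP_NUM_THREADS:-unset}"
    # setting the thread count explicitly avoids relying on OpenMP auto-detection
    export OMP_NUM_THREADS=8
    ibrun mdrun_mpi_gpu -pin on -ntomp 8 -deffnm pi3k-wt-1 -gpu_id 00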

>
> For optimal performance with a GPU nstlist (now 5) should be larger.
> The optimum depends on your CPU and GPU resources.
> You might want to try several nstlist values.
> Changing nstlist from 5 to 40, rlist from 1 to 1.093
>
> Input Parameters:
>    integrator                     = md
>    nsteps               = 5000000
>    init-step            = 0
>    cutoff-scheme                  = Verlet
>    ns-type                        = Grid
>    nstlist              = 40
>    ndelta               = 2
>    nstcomm              = 100
>    comm-mode                      = Linear
>    nstlog               = 10000
>    nstxout              = 10000
>    nstvout              = 10000
>    nstfout              = 0
>    nstcalcenergy        = 100
>    nstenergy            = 10000
>    nstxout-compressed   = 10000
>    init-t               = 0
>    delta-t              = 0.002
>    x-compression-precision = 1000
>    fourierspacing       = 0.16
>    nkx                  = 96
>    nky                  = 96
>    nkz                  = 96
>    pme-order            = 4
>    ewald-rtol           = 1e-05
>    ewald-rtol-lj        = 0.001
>    ewald-geometry       = 0
>    epsilon-surface      = 0
>    optimize-fft                   = FALSE
>    lj-pme-comb-rule               = Geometric
>    ePBC                           = xyz
>    bPeriodicMols                  = FALSE
>    bContinuation                  = TRUE
>    bShakeSOR                      = FALSE
>    etc                            = V-rescale
>    bPrintNHChains                 = FALSE
>    nsttcouple           = 5
>    epc                            = Parrinello-Rahman
>    epctype                        = Isotropic
>    nstpcouple           = 5
>    tau-p                = 2
>    ref-p (3x3):
>       ref-p[    0]={ 1.00000e+00,  0.00000e+00,  0.00000e+00}
>       ref-p[    1]={ 0.00000e+00,  1.00000e+00,  0.00000e+00}
>       ref-p[    2]={ 0.00000e+00,  0.00000e+00,  1.00000e+00}
>    compress (3x3):
>       compress[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
>       compress[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
>       compress[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
>    refcoord-scaling               = No
>    posres-com (3):
>       posres-com[0]= 0.00000e+00
>       posres-com[1]= 0.00000e+00
>       posres-com[2]= 0.00000e+00
>    posres-comB (3):
>       posres-comB[0]= 0.00000e+00
>       posres-comB[1]= 0.00000e+00
>       posres-comB[2]= 0.00000e+00
>    verlet-buffer-tolerance = 0.005
>    rlist                = 1.093
>    rlistlong            = 1.093
>    nstcalclr            = 5
>    rtpi                 = 0.05
>    coulombtype                    = PME
>    coulomb-modifier               = Potential-shift
>    rcoulomb-switch      = 0
>    rcoulomb             = 1
>    vdwtype                        = Cut-off
>    vdw-modifier                   = Potential-shift
>    rvdw-switch          = 0
>    rvdw                 = 1
>    epsilon-r            = 1
>    epsilon-rf                     = inf
>    tabext               = 1
>    implicit-solvent               = No
>    gb-algorithm                   = Still
>    gb-epsilon-solvent   = 80
>    nstgbradii           = 1
>    rgbradii             = 1
>    gb-saltconc          = 0
>    gb-obc-alpha         = 1
>    gb-obc-beta          = 0.8
>    gb-obc-gamma         = 4.85
>    gb-dielectric-offset = 0.009
>    sa-algorithm                   = Ace-approximation
>    sa-surface-tension   = 2.05016
>    DispCorr                       = EnerPres
>    bSimTemp                       = FALSE
>    free-energy                    = no
>    nwall                = 0
>    wall-type                      = 9-3
>    wall-atomtype[0]     = -1
>    wall-atomtype[1]     = -1
>    wall-density[0]      = 0
>    wall-density[1]      = 0
>    wall-ewald-zfac      = 3
>    pull                           = no
>    rotation                       = FALSE
>    interactiveMD                  = FALSE
>    disre                          = No
>    disre-weighting                = Conservative
>    disre-mixed                    = FALSE
>    dr-fc                = 1000
>    dr-tau               = 0
>    nstdisreout          = 100
>    orires-fc            = 0
>    orires-tau           = 0
>    nstorireout          = 100
>    dihre-fc             = 0
>    em-stepsize          = 0.01
>    em-tol               = 10
>    niter                = 20
>    fc-stepsize          = 0
>    nstcgsteep           = 1000
>    nbfgscorr            = 10
>    ConstAlg                       = Lincs
>    shake-tol            = 0.0001
>    lincs-order          = 4
>    lincs-warnangle      = 30
>    lincs-iter           = 1
>    bd-fric              = 0
>    ld-seed              = 645545913
>    cos-accel            = 0
>    deform (3x3):
>       deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>       deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>       deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>    adress                         = FALSE
>    userint1             = 0
>    userint2             = 0
>    userint3             = 0
>    userint4             = 0
>    userreal1            = 0
>    userreal2            = 0
>    userreal3            = 0
>    userreal4            = 0
> grpopts:
>    nrdf:     42998.7      429867
>    ref-t:         310         310
>    tau-t:         0.1         0.1
> anneal:          No          No
> ann-npoints:           0           0
>    acc:            0           0           0
>    nfreeze:           N           N           N
>    energygrp-flags[  0]: 0
>    efield-x:
>       n = 0
>    efield-xt:
>       n = 0
>    efield-y:
>       n = 0
>    efield-yt:
>       n = 0
>    efield-z:
>       n = 0
>    efield-zt:
>       n = 0
>    eSwapCoords                    = no
>    bQMMM                          = FALSE
>    QMconstraints        = 0
>    QMMMscheme           = 0
>    scalefactor          = 1
> qm-opts:
>    ngQM                 = 0
>
> Initializing Domain Decomposition on 4 nodes
> Dynamic load balancing: auto
> Will sort the charge groups at every domain (re)decomposition
> Initial maximum inter charge-group distances:
>     two-body bonded interactions: 0.429 nm, LJ-14, atoms 13175 13183
>   multi-body bonded interactions: 0.489 nm, CMAP Dih., atoms 18312 18321
> Minimum cell size due to bonded interactions: 0.537 nm
> Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.819 nm
> Estimated maximum distance required for P-LINCS: 0.819 nm
> This distance will limit the DD cell size, you can override this with -rcon
> Using 0 separate PME nodes, as there are too few total
>  nodes for efficient splitting
> Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
> Optimizing the DD grid for 4 cells with a minimum initial size of 1.024 nm
> The maximum allowed number of cells is: X 11 Y 11 Z 10
> Domain decomposition grid 4 x 1 x 1, separate PME nodes 0
> PME domain decomposition: 4 x 1 x 1
> Domain decomposition nodeid 0, coordinates 0 0 0
>
> Using two step summing over 2 groups of on average 2.0 processes
>
> Using 4 MPI processes
> Using 8 OpenMP threads per MPI process

Try running with 4 ranks and 4 threads each, which avoids using
Hyper-Threading and will probably improve performance.
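
Something along these lines (a sketch only; I am using a generic mpirun here,
so substitute your site's launcher, and note that -gpu_id needs one digit per
PP rank placed on each node):

    export OMP_NUM_THREADS=4
    mpirun -np 4 mdrun_mpi_gpu -pin on -ntomp 4 -deffnm pi3k-wt-1 -gpu_id 00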

> Detecting CPU SIMD instructions.
> Present hardware specification:
> Vendor: GenuineIntel
> Brand:  Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
> Family:  6  Model: 45  Stepping:  7
> Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
> SIMD instructions most likely to fit this hardware: AVX_256
> SIMD instructions selected at GROMACS compile time: AVX_256
>
>
> 1 GPU detected on host c442-403.stampede.tacc.utexas.edu:
>   #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
>
> 1 GPU user-selected for this run.
> Mapping of GPUs to the 2 PP ranks in this node: #0, #0
>
> NOTE: You assigned a GPU to multiple MPI processes.
> Will do PME sum in reciprocal space for electrostatic interactions.
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
> A smooth particle mesh Ewald method
> J. Chem. Phys. 103 (1995) pp. 8577-8592
> -------- -------- --- Thank You --- -------- --------
>
> Will do ordinary reciprocal space Ewald sum.
> Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
> Cut-off's:   NS: 1.093   Coulomb: 1   LJ: 1
> Long Range LJ corr.: <C6> 3.1875e-04
> System total charge: 1.000
> Generated table with 1046 data points for Ewald.
> Tabscale = 500 points/nm
> Generated table with 1046 data points for LJ6.
> Tabscale = 500 points/nm
> Generated table with 1046 data points for LJ12.
> Tabscale = 500 points/nm
> Generated table with 1046 data points for 1-4 COUL.
> Tabscale = 500 points/nm
> Generated table with 1046 data points for 1-4 LJ6.
> Tabscale = 500 points/nm
> Generated table with 1046 data points for 1-4 LJ12.
> Tabscale = 500 points/nm
>
> Using CUDA 8x8 non-bonded kernels
>
> Potential shift: LJ r^-12: -1.000e+00 r^-6: -1.000e+00, Ewald -1.000e-05
> Initialized non-bonded Ewald correction tables, spacing: 6.52e-04 size: 1536
>
>
> Overriding thread affinity set outside mdrun_mpi_gpu
>
> Pinning threads with an auto-selected logical core stride of 1
>
> Initializing Parallel LINear Constraint Solver
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess
> P-LINCS: A Parallel Linear Constraint Solver for molecular simulation
> J. Chem. Theory Comput. 4 (2008) pp. 116-122
> -------- -------- --- Thank You --- -------- --------
>
> The number of constraints is 21852
> There are inter charge-group constraints,
> will communicate selected coordinates each lincs iteration
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> S. Miyamoto and P. A. Kollman
> SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
> Water Models
> J. Comp. Chem. 13 (1992) pp. 952-962
> -------- -------- --- Thank You --- -------- --------
>
>
> Linking all bonded interactions to atoms
> There are 333337 inter charge-group exclusions,
> will use an extra communication step for exclusion forces for PME
>
> The initial number of communication pulses is: X 1
> The initial domain decomposition cell size is: X 3.06 nm
>
> The maximum allowed distance for charge groups involved in interactions is:
>                  non-bonded interactions           1.093 nm
> (the following are initial values, they could change due to box deformation)
>             two-body bonded interactions  (-rdd)   1.093 nm
>           multi-body bonded interactions  (-rdd)   1.093 nm
>   atoms separated by up to 5 constraints  (-rcon)  3.061 nm
>
> When dynamic load balancing gets turned on, these settings will change to:
> The maximum number of communication pulses is: X 1
> The minimum size for domain decomposition cells is 1.093 nm
> The requested allowed shrink of DD cells (option -dds) is: 0.80
> The allowed shrink of domain decomposition cells is: X 0.36
> The maximum allowed distance for charge groups involved in interactions is:
>                  non-bonded interactions           1.093 nm
>             two-body bonded interactions  (-rdd)   1.093 nm
>           multi-body bonded interactions  (-rdd)   1.093 nm
>   atoms separated by up to 5 constraints  (-rcon)  1.093 nm
>
>
> Making 1D domain decomposition grid 4 x 1 x 1, home cell index 0 0 0
>
> Center of mass motion removal mode is Linear
> We have the following groups for center of mass motion removal:
>   0:  rest
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> G. Bussi, D. Donadio and M. Parrinello
> Canonical sampling through velocity rescaling
> J. Chem. Phys. 126 (2007) pp. 014101
> -------- -------- --- Thank You --- -------- --------
>
> There are: 236549 Atoms
> Charge group distribution at step 0: 58642 59637 59750 58520
> Initial temperature: 310.644 K
>
> Started mdrun on node 0 Wed Jul 16 11:26:55 2014
>            Step           Time         Lambda
>               0        0.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.07365e+04    2.95121e+04    3.01332e+03   -7.32021e+03    1.97198e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.01566e+05    4.01053e+05   -3.13304e+04   -3.70280e+06    2.22526e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01359e+06    6.13614e+05   -2.39998e+06    3.12141e+02   -2.18173e+02
>  Pressure (bar)   Constr. rmsd
>    -3.47613e+01    3.40465e-05
>
> DD  step 39 load imb.: force 16.2%
>
> step   80: timed with pme grid 96 96 96, coulomb cutoff 1.000: 1156.6 M-cycles
> step  160: timed with pme grid 80 80 80, coulomb cutoff 1.172: 1547.8 M-cycles
> step  240: timed with pme grid 96 96 96, coulomb cutoff 1.000: 1151.5 M-cycles
> step  320: timed with pme grid 84 84 84, coulomb cutoff 1.116: 1385.3 M-cycles
>               optimal pme grid 96 96 96, coulomb cutoff 1.000
> DD  step 9999 load imb.: force 12.0%
>
>            Step           Time         Lambda
>           10000       20.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.04795e+04    2.99195e+04    2.92288e+03   -7.24773e+03    2.00091e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.01757e+05    4.01852e+05   -3.13182e+04   -3.70354e+06    2.21148e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01305e+06    6.07693e+05   -2.40536e+06    3.09129e+02   -2.18004e+02
>  Pressure (bar)   Constr. rmsd
>     4.97661e+01    3.21278e-05
>
> DD  step 19999 load imb.: force 12.9%
>
>            Step           Time         Lambda
>           20000       40.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.03101e+04    2.98583e+04    2.92319e+03   -7.24572e+03    1.98643e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02386e+05    4.04043e+05   -3.13265e+04   -3.70922e+06    2.22318e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01618e+06    6.10725e+05   -2.40545e+06    3.10671e+02   -2.18119e+02
>  Pressure (bar)   Constr. rmsd
>     5.25345e+01    3.19849e-05
>
> DD  step 29999 load imb.: force 12.9%
>
>            Step           Time         Lambda
>           30000       60.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.04086e+04    2.98208e+04    2.96232e+03   -7.36511e+03    1.97707e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02219e+05    4.04588e+05   -3.13898e+04   -3.70933e+06    2.18899e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01643e+06    6.09712e+05   -2.40671e+06    3.10156e+02   -2.19002e+02
>  Pressure (bar)   Constr. rmsd
>     1.69563e+01    3.28362e-05
>
> DD  step 39999 load imb.: force 13.6%
>
>            Step           Time         Lambda
>           40000       80.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.08454e+04    2.97168e+04    2.88905e+03   -7.38898e+03    1.97906e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.01724e+05    4.00249e+05   -3.13130e+04   -3.70497e+06    2.19180e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01654e+06    6.09973e+05   -2.40657e+06    3.10288e+02   -2.17931e+02
>  Pressure (bar)   Constr. rmsd
>    -4.75665e+01    3.26365e-05
>
> DD  step 49999 load imb.: force 15.1%
>
>            Step           Time         Lambda
>           50000      100.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.05524e+04    2.96917e+04    2.93777e+03   -7.29600e+03    1.98992e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.01154e+05    4.02695e+05   -3.12880e+04   -3.70590e+06    2.18002e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01576e+06    6.07954e+05   -2.40780e+06    3.09262e+02   -2.17584e+02
>  Pressure (bar)   Constr. rmsd
>    -7.76501e+00    3.21618e-05
>
> DD  step 59999 load imb.: force 12.5%
>
>            Step           Time         Lambda
>           60000      120.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.00781e+04    3.00110e+04    3.14274e+03   -7.37195e+03    2.00457e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02606e+05    4.01699e+05   -3.13388e+04   -3.70509e+06    2.19611e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01426e+06    6.09684e+05   -2.40458e+06    3.10142e+02   -2.18291e+02
>  Pressure (bar)   Constr. rmsd
>    -3.63586e-01    3.19734e-05
>
> DD  step 69999 load imb.: force 11.2%
>
>            Step           Time         Lambda
>           70000      140.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.08759e+04    2.99581e+04    2.98187e+03   -7.47015e+03    1.97981e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02357e+05    4.01634e+05   -3.13895e+04   -3.70518e+06    2.19320e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01450e+06    6.10453e+05   -2.40404e+06    3.10533e+02   -2.18998e+02
>  Pressure (bar)   Constr. rmsd
>    -5.49562e+01    3.32050e-05
>
> DD  step 79999 load imb.: force 12.7%
>
>            Step           Time         Lambda
>           80000      160.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.02476e+04    2.99203e+04    3.01794e+03   -7.41418e+03    1.99012e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02636e+05    3.99866e+05   -3.13105e+04   -3.70666e+06    2.17894e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01801e+06    6.10188e+05   -2.40782e+06    3.10398e+02   -2.17897e+02
>  Pressure (bar)   Constr. rmsd
>    -7.82256e+01    3.22503e-05
>
> Writing checkpoint, step 84280 at Wed Jul 16 11:41:55 2014
>
>
> DD  step 89999 load imb.: force  9.5%
>
>            Step           Time         Lambda
>           90000      180.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.04253e+04    3.01029e+04    3.04981e+03   -7.29947e+03    1.98989e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02004e+05    4.03726e+05   -3.12806e+04   -3.70717e+06    2.20550e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01449e+06    6.08805e+05   -2.40568e+06    3.09694e+02   -2.17480e+02
>  Pressure (bar)   Constr. rmsd
>     2.29629e+01    3.23359e-05
>
> DD  step 99999 load imb.: force 11.4%
>
>            Step           Time         Lambda
>          100000      200.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.05809e+04    2.97365e+04    2.90575e+03   -7.46760e+03    2.00142e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02442e+05    4.02628e+05   -3.13276e+04   -3.70909e+06    2.19456e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01763e+06    6.09703e+05   -2.40793e+06    3.10151e+02   -2.18135e+02
>  Pressure (bar)   Constr. rmsd
>     2.61670e+01    3.23152e-05
>
> DD  step 109999 load imb.: force 11.5%
>
>            Step           Time         Lambda
>          110000      220.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.04489e+04    2.98261e+04    2.96408e+03   -7.46597e+03    1.99103e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.01929e+05    4.04057e+05   -3.13158e+04   -3.70812e+06    2.21537e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01561e+06    6.09714e+05   -2.40590e+06    3.10157e+02   -2.17970e+02
>  Pressure (bar)   Constr. rmsd
>     3.75535e+01    3.23884e-05
>
> DD  step 119999 load imb.: force 13.4%
>
>            Step           Time         Lambda
>          120000      240.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.02048e+04    2.96834e+04    2.99140e+03   -7.47253e+03    1.98509e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02924e+05    4.00695e+05   -3.13677e+04   -3.70737e+06    2.19556e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01790e+06    6.09085e+05   -2.40882e+06    3.09837e+02   -2.18693e+02
>  Pressure (bar)   Constr. rmsd
>    -4.17847e+01    3.24539e-05
>
> DD  step 129999 load imb.: force 13.9%
>
>            Step           Time         Lambda
>          130000      260.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     4.99271e+04    2.98272e+04    2.93917e+03   -7.27635e+03    1.98999e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02518e+05    3.97726e+05   -3.13026e+04   -3.70217e+06    2.20807e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01583e+06    6.09694e+05   -2.40614e+06    3.10147e+02   -2.17788e+02
>  Pressure (bar)   Constr. rmsd
>    -1.15382e+02    3.19481e-05
>
> DD  step 139999 load imb.: force 10.1%
>
>            Step           Time         Lambda
>          140000      280.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.03356e+04    2.97673e+04    2.87730e+03   -7.47602e+03    1.97827e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02578e+05    4.02705e+05   -3.12475e+04   -3.70659e+06    2.19426e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01532e+06    6.06119e+05   -2.40920e+06    3.08328e+02   -2.17021e+02
>  Pressure (bar)   Constr. rmsd
>    -4.87389e+01    3.29943e-05
>
> DD  step 149999 load imb.: force 12.0%
>
>            Step           Time         Lambda
>          150000      300.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.01586e+04    2.98972e+04    2.97697e+03   -7.38809e+03    1.99546e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02136e+05    4.04987e+05   -3.13378e+04   -3.71164e+06    2.18617e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01840e+06    6.08594e+05   -2.40980e+06    3.09587e+02   -2.18277e+02
>  Pressure (bar)   Constr. rmsd
>     4.10102e+01    3.20743e-05
>
> DD  step 159999 load imb.: force 11.6%
>
>            Step           Time         Lambda
>          160000      320.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.02983e+04    2.99181e+04    2.94672e+03   -7.49915e+03    1.99437e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.03281e+05    4.00190e+05   -3.12908e+04   -3.70493e+06    2.23782e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01477e+06    6.09016e+05   -2.40575e+06    3.09802e+02   -2.17623e+02
>  Pressure (bar)   Constr. rmsd
>    -6.39499e+01    3.24603e-05
>
> Writing checkpoint, step 168560 at Wed Jul 16 11:56:56 2014
>
>
> DD  step 169999 load imb.: force 12.3%
>
>            Step           Time         Lambda
>          170000      340.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     4.96860e+04    2.97866e+04    2.87484e+03   -7.42084e+03    1.98346e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.03200e+05    4.05459e+05   -3.13545e+04   -3.71085e+06    2.20106e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01677e+06    6.08386e+05   -2.40839e+06    3.09481e+02   -2.18509e+02
>  Pressure (bar)   Constr. rmsd
>     7.35129e+01    3.22101e-05
>
> DD  step 179999 load imb.: force 12.5%
>
>            Step           Time         Lambda
>          180000      360.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.01938e+04    2.98005e+04    3.12308e+03   -7.44680e+03    1.97367e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.02109e+05    4.02954e+05   -3.12995e+04   -3.70869e+06    2.20544e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01747e+06    6.08815e+05   -2.40865e+06    3.09700e+02   -2.17744e+02
>  Pressure (bar)   Constr. rmsd
>    -1.15302e+01    3.22109e-05
>
> DD  step 189999 load imb.: force 13.4%
>
>            Step           Time         Lambda
>          190000      380.00000        0.00000
>
>    Energies (kJ/mol)
>             U-B    Proper Dih.  Improper Dih.      CMAP Dih.          LJ-14
>     5.07811e+04    2.99541e+04    2.98628e+03   -7.39091e+03    1.97456e+04
>      Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     2.01761e+05    4.03922e+05   -3.13225e+04   -3.71005e+06    2.19542e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -3.01766e+06    6.10133e+05   -2.40753e+06    3.10370e+02   -2.18064e+02
>  Pressure (bar)   Constr. rmsd
>    -2.96181e+01    3.28160e-05
>
> If you want to see the full log file, please give me an email address I can send it to.

Pastebin?

Cheers,
Sz.


> Thank you.
> Yunlong
>
> ________________________________________
> From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Mark Abraham <mark.j.abraham at gmail.com>
> Sent: 18 July 2014 23:52
> To: Discussion list for GROMACS users
> Subject: Re: [gmx-users] Can't allocate memory problem
>
> Hi,
>
> That's highly unusual, and suggests you are doing something highly unusual,
> like trying to run on huge numbers of threads, or with very large numbers of
> bonded interactions. How are you setting up the mdrun call, and what is in
> your tpr?
>
> Mark
> On Jul 17, 2014 10:13 PM, "Yunlong Liu" <yliu120 at jhmi.edu> wrote:
>
>> Hi,
>>
>>
>> I am currently experiencing a "Can't allocate memory" problem on Gromacs
>> 4.6.5 with GPU acceleration.
>>
>> I am running my simulations on the Stampede supercomputer at TACC, in its
>> GPU queue. I first ran into this when the simulation length exceeded 10 ns:
>> the run throws the "Can't allocate memory" error as follows:
>>
>>
>> Fatal error:
>> Not enough memory. Failed to realloc 1403808 bytes for f_t->f,
>> f_t->f=0xa912a010
>> (called from file
>> /admin/build/admin/rpms/stampede/BUILD/gromacs-4.6.5/src/gmxlib/bondfree.c,
>> line 3840)
>> For more information and tips for troubleshooting, please check the GROMACS
>> website at http://www.gromacs.org/Documentation/Errors
>> -------------------------------------------------------
>>
>> "These Gromacs Guys Really Rock" (P.J. Meulenhoff)
>> : Cannot allocate memory
>> Error on node 0, will try to stop all the nodes
>> Halting parallel program mdrun_mpi_gpu on CPU 0 out of 4
>>
>> -------------------------------------------------------
>> Program mdrun_mpi_gpu, VERSION 4.6.5
>> Source code file:
>> /admin/build/admin/rpms/stampede/BUILD/gromacs-4.6.5/src/gmxlib/smalloc.c,
>> line: 241
>>
>> Fatal error:
>> Not enough memory. Failed to realloc 1403808 bytes for f_t->f,
>> f_t->f=0xaa516e90
>> (called from file
>> /admin/build/admin/rpms/stampede/BUILD/gromacs-4.6.5/src/gmxlib/bondfree.c,
>> line 3840)
>> For more information and tips for troubleshooting, please check the GROMACS
>> website at http://www.gromacs.org/Documentation/Errors
>> -------------------------------------------------------
>>
>> Recently, this error occurs even when I run a short NVT equilibration. The
>> problem also exists when I use GROMACS 5.0 with GPU acceleration. I looked
>> up the GROMACS errors page to check the possible causes, but none of them
>> seem to fit this situation: I am running on a capable machine (Stampede),
>> my simulations are short, and I know GROMACS uses nanometers as its unit.
>> I have tried every solution I could think of, but the problem has only
>> become more severe.
>>
>> Is there anybody that has an idea on solving this issue?
>>
>> Thank you.
>>
>> Yunlong
>>
>>
>>
>>
>>
>>
>>
>>
>> Davis Yunlong Liu
>>
>> BCMB - Second Year PhD Candidate
>>
>> School of Medicine
>>
>> The Johns Hopkins University
>>
>> E-mail: yliu120 at jhmi.edu<mailto:yliu120 at jhmi.edu>
>> --
>> Gromacs Users mailing list
>>
>> * Please search the archive at
>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>> posting!
>>
>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>
>> * For (un)subscribe requests visit
>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>> send a mail to gmx-users-request at gromacs.org.
>>
> --
> Gromacs Users mailing list
>
> * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.


