[gmx-users] CUDA fails to allocate memory to mdrun
Akshay Kumar Ganguly
akshayganguly at yahoo.co.in
Mon Mar 30 12:37:41 CEST 2015
Hi!
I'm running GROMACS v5.0.4 on 32-bit CentOS 6.5 with CUDA toolkit v4.0 and the latest NVIDIA Linux driver (v340.24) for a Quadro K4200.
My explicit-solvent runs crash during PME grid optimization with the following error:
-------------------------------------------------------
There are: 75766 Atoms
Initial temperature: 301.154 K
Started mdrun on rank 0 Mon Mar 30 14:30:54 2015
           Step           Time         Lambda
              0        0.00000        0.00000
   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    6.92142e+03    9.29820e+03    3.90420e+02    3.61267e+03    4.01187e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip. Position Rest.
    2.14594e+05   -9.79483e+03   -1.47636e+06    6.97759e+03    2.17547e-01
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -1.20424e+06    1.89766e+05   -1.01448e+06    3.01035e+02   -2.11354e+02
 Pressure (bar)   Constr. rmsd
   -4.75277e+02    2.75540e-05
step 80: timed with pme grid 60 60 60, coulomb cutoff 1.000: 3398.8 M-cycles
           Step           Time         Lambda
            100        0.20000        0.00000
   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    7.24215e+03    9.27752e+03    4.13476e+02    3.61128e+03    4.00927e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip. Position Rest.
    2.17175e+05   -9.83595e+03   -1.47626e+06    4.92524e+03    8.51347e+02
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -1.20251e+06    1.87791e+05   -1.01472e+06    2.97902e+02   -2.13130e+02
 Pressure (bar)   Constr. rmsd
   -2.43339e+02    2.65826e-05
step 160: timed with pme grid 52 52 52, coulomb cutoff 1.102: 3359.8 M-cycles
-------------------------------------------------------
Program mdrun, VERSION 5.0.4
Source code file: /nmr/gromacs-5.0.4/src/gromacs/gmxlib/cuda_tools/pmalloc_cuda.cu, line: 61
Fatal error:
cudaMallocHost of size 2851616 bytes failed: out of memory
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
The size of the failed allocation varies from error to error, but it generally ranges from 2.5 to 5 MB.
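To compare the failed sizes across several crashed runs, the error line can be pulled out of each log with a few lines of Python. This is only a minimal sketch: the function names `failed_alloc_sizes` and `to_mib` are made up for illustration, and the pattern assumes the error message is worded exactly as in the log above.

```python
import re

# Matches the fatal-error line as mdrun printed it in the log above.
# Assumption: other failures use the same wording.
_ALLOC_FAILURE = re.compile(r"cudaMallocHost of size (\d+) bytes failed")


def failed_alloc_sizes(log_text):
    """Return the size in bytes of every failed cudaMallocHost call in log_text."""
    return [int(m.group(1)) for m in _ALLOC_FAILURE.finditer(log_text)]


def to_mib(n_bytes):
    """Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


if __name__ == "__main__":
    sample = "cudaMallocHost of size 2851616 bytes failed: out of memory"
    for size in failed_alloc_sizes(sample):
        print(f"{size} bytes = {to_mib(size):.2f} MiB")
    # prints "2851616 bytes = 2.72 MiB"
```

For the crash shown above, the request was about 2.72 MiB, which sits at the low end of the range I have seen.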
While it is likely that the newer toolkits (v5-7) have resolved this issue, I am unable to use them since they lack support for 32-bit operating systems.
I would be obliged if you could suggest a way to free up some page-locked memory for mdrun and/or provide a patch for pmalloc_cuda.cu. (Page-locked memory is what Google says the problem is, but I'm still all at sea as far as the solution goes.)
The problem looks to be limited to larger systems (large proteins of more than 150 residues in explicit-water cubes of ~25000 molecules), as I am able to run smaller ones (~100 residues) with ease in explicit solvent using the GPU for non-bonded calculations (mdrun -v -deffnm $prot_md -nb gpu).
I have attached the corresponding log file and .mdp file.
Regards,
Akshay Kumar Ganguly
Graduate Student, Dr. Neel S. Bhavesh lab,
Structural and Computational Biology Group,
International Centre for Genetic Engineering and Biotechnology,
New Delhi - 67