[gmx-users] CUDA fails to allocate memory to mdrun

Akshay Kumar Ganguly akshayganguly at yahoo.co.in
Mon Mar 30 12:37:41 CEST 2015


 

Hi!
I'm running GROMACS v5.0.4 on a 32-bit CentOS 6.5 system with CUDA toolkit v4.0 and the latest NVIDIA Linux driver (v340.24) for a Quadro K4200.

My explicit-solvent runs crash during PME grid optimization with the following error:

-------------------------------------------------------

There are: 75766 Atoms
Initial temperature: 301.154 K

Started mdrun on rank 0 Mon Mar 30 14:30:54 2015
           Step           Time         Lambda
              0        0.00000        0.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    6.92142e+03    9.29820e+03    3.90420e+02    3.61267e+03    4.01187e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip. Position Rest.
    2.14594e+05   -9.79483e+03   -1.47636e+06    6.97759e+03    2.17547e-01
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -1.20424e+06    1.89766e+05   -1.01448e+06    3.01035e+02   -2.11354e+02
 Pressure (bar)   Constr. rmsd
   -4.75277e+02    2.75540e-05

step   80: timed with pme grid 60 60 60, coulomb cutoff 1.000: 3398.8 M-cycles
           Step           Time         Lambda
            100        0.20000        0.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    7.24215e+03    9.27752e+03    4.13476e+02    3.61128e+03    4.00927e+04
        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip. Position Rest.
    2.17175e+05   -9.83595e+03   -1.47626e+06    4.92524e+03    8.51347e+02
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -1.20251e+06    1.87791e+05   -1.01472e+06    2.97902e+02   -2.13130e+02
 Pressure (bar)   Constr. rmsd
   -2.43339e+02    2.65826e-05

step  160: timed with pme grid 52 52 52, coulomb cutoff 1.102: 3359.8 M-cycles

-------------------------------------------------------
Program mdrun, VERSION 5.0.4
Source code file: /nmr/gromacs-5.0.4/src/gromacs/gmxlib/cuda_tools/pmalloc_cuda.cu, line: 61

Fatal error:
cudaMallocHost of size 2851616 bytes failed: out of memory

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

The size of the failed allocation varies from one crash to the next, but it generally ranges from 2.5 to 5 MB.
While it is likely that the newer toolkits (v5-7) have resolved this issue, I am unable to use them since they lack support for 32-bit OSes.
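
To compare the failed allocation sizes across several crashes, the relevant lines can be pulled out of each log with a short script. The following is only a minimal sketch in Python: the regular expressions assume the exact wording of the log lines quoted above, and the names `summarize_log`, `to_mib` and `SAMPLE` are made up for illustration.

```python
import re

# Patterns follow the wording of the log excerpt above (assumption:
# other GROMACS versions may phrase these lines differently).
ALLOC_RE = re.compile(r"cudaMallocHost of size (\d+) bytes failed")
TUNE_RE = re.compile(
    r"step\s+(\d+): timed with pme grid (\d+) (\d+) (\d+), "
    r"coulomb cutoff ([\d.]+): ([\d.]+) M-cycles"
)


def summarize_log(text):
    """Return (PME tuning steps, failed cudaMallocHost sizes in bytes)."""
    tuning = [
        {
            "step": int(m.group(1)),
            "grid": (int(m.group(2)), int(m.group(3)), int(m.group(4))),
            "cutoff": float(m.group(5)),
            "mcycles": float(m.group(6)),
        }
        for m in TUNE_RE.finditer(text)
    ]
    failed_bytes = [int(m.group(1)) for m in ALLOC_RE.finditer(text)]
    return tuning, failed_bytes


def to_mib(n_bytes):
    """Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


# Lines copied from the log excerpt above.
SAMPLE = """\
step   80: timed with pme grid 60 60 60, coulomb cutoff 1.000: 3398.8 M-cycles
step  160: timed with pme grid 52 52 52, coulomb cutoff 1.102: 3359.8 M-cycles
cudaMallocHost of size 2851616 bytes failed: out of memory
"""

if __name__ == "__main__":
    tuning, failed = summarize_log(SAMPLE)
    for t in tuning:
        print("step", t["step"], "grid", t["grid"], "cutoff", t["cutoff"])
    for size in failed:
        print("failed allocation:", size, "bytes =", round(to_mib(size), 2), "MiB")
```

Running the same function over the full text of each crashed run's log would show which PME grid was being timed when the allocation failed.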

I would be obliged if you could suggest a way to free up some page-locked memory for mdrun (at least that is what Google says the problem is, but I am still all at sea as far as a solution goes) and/or provide a patch for pmalloc_cuda.cu.
The problem appears to be limited to larger systems (proteins of >150 residues in explicit water cubes of ~25000 molecules), as I am able to run smaller ones (~100 residues) with ease in explicit solvent using the GPU for non-bonded calculations (mdrun -v -deffnm $prot_md -nb gpu).
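
For side-by-side test runs, the mdrun invocation quoted above could be assembled by a small helper. This is a hedged sketch, not a fix: `build_mdrun_command` and the run name `prot_md` are hypothetical, and the optional `-notunepme` switch (which turns off mdrun's PME tuning) is included only as something that might be worth trying, since the crash occurs during PME grid optimization. Whether it avoids the failed allocation has not been verified.

```python
import shlex

# Values accepted by mdrun's -nb option in the 5.0 series.
NB_CHOICES = ("auto", "cpu", "gpu", "gpu_cpu")


def build_mdrun_command(deffnm, nb="gpu", tune_pme=True):
    """Assemble an mdrun argument list like the one quoted above.

    deffnm   -- default file name prefix (hypothetical placeholder here)
    nb       -- where to compute non-bonded interactions
    tune_pme -- if False, append -notunepme (untested idea for this crash)
    """
    if nb not in NB_CHOICES:
        raise ValueError("nb must be one of %s" % (NB_CHOICES,))
    cmd = ["mdrun", "-v", "-deffnm", deffnm, "-nb", nb]
    if not tune_pme:
        cmd.append("-notunepme")
    return cmd


if __name__ == "__main__":
    # Print shell-quoted command lines for three variants of the same run.
    for kwargs in ({}, {"tune_pme": False}, {"nb": "cpu"}):
        cmd = build_mdrun_command("prot_md", **kwargs)
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting issues if the run name contains unusual characters.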

I have attached the corresponding log file and mdp file.

Regards,
 
 Akshay Kumar Ganguly

Graduate Student, Dr. Neel S. Bhavesh lab,
Structural and Computational Biology Group,
International Centre for Genetic Engineering and Biotechnology,
New Delhi - 67


  

