[gmx-users] cudaFuncGetAttributes failed: out of memory
bonjour899
bonjour899 at 126.com
Sun Feb 23 04:33:14 CET 2020
I also tried restricting mdrun to different GPUs using -gpu_id, but I still get the same error. I've also posted my question at https://devtalk.nvidia.com/default/topic/1072038/cuda-programming-and-performance/cudafuncgetattributes-failed-out-of-memory/
Following is the output of nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... On | 00000000:04:00.0 Off | 0 |
| N/A 35C P0 34W / 250W | 16008MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla P100-PCIE... On | 00000000:06:00.0 Off | 0 |
| N/A 35C P0 28W / 250W | 10MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla P100-PCIE... On | 00000000:07:00.0 Off | 0 |
| N/A 35C P0 33W / 250W | 16063MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla P100-PCIE... On | 00000000:08:00.0 Off | 0 |
| N/A 36C P0 29W / 250W | 10MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 Quadro P4000 On | 00000000:0B:00.0 Off | N/A |
| 46% 27C P8 8W / 105W | 12MiB / 8119MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 20497 C /usr/bin/python3 5861MiB |
| 0 24503 C /usr/bin/python3 10137MiB |
| 2 23162 C /home/appuser/Miniconda3/bin/python 16049MiB |
+-----------------------------------------------------------------------------+
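For completeness, here is how I selected a device; the CUDA_VISIBLE_DEVICES variant is only a sketch of an alternative (it renumbers the visible card, so mdrun then sees it as id 0), not something I have confirmed fixes this:

```shell
# Check per-GPU free memory to pick an empty card (GPUs 1 and 3 above).
nvidia-smi --query-gpu=index,memory.free --format=csv

# What I ran: select GPU 3 through GROMACS itself.
gmx mdrun -deffnm pull -ntmpi 1 -nb gpu -pme gpu -gpu_id 3

# Alternative sketch: hide the full GPUs from CUDA entirely, so only
# device 3 is ever enumerated; GROMACS then sees it as id 0.
export CUDA_VISIBLE_DEVICES=3
gmx mdrun -deffnm pull -ntmpi 1 -nb gpu -pme gpu -gpu_id 0
```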
-------- Forwarding messages --------
From: "bonjour899" <bonjour899 at 126.com>
Date: 2020-02-20 10:30:36
To: "gromacs.org_gmx-users at maillist.sys.kth.se" <gromacs.org_gmx-users at maillist.sys.kth.se>
Subject: cudaFuncGetAttributes failed: out of memory
Hello,
I have encountered a weird problem. I've been using GROMACS with GPUs on a server and it has always performed well. However, when I reran a job today I suddenly got this error:
Command line:
gmx mdrun -deffnm pull -ntmpi 1 -nb gpu -pme gpu -gpu_id 3
Back Off! I just backed up pull.log to ./#pull.log.1#
-------------------------------------------------------
Program: gmx mdrun, version 2019.4
Source file: src/gromacs/gpu_utils/gpu_utils.cu (line 100)
Fatal error:
cudaFuncGetAttributes failed: out of memory
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
According to nvidia-smi the GPU is essentially unoccupied, and I can run other GPU applications on it, but I cannot run GROMACS mdrun anymore, not even an energy minimization.