[gmx-users] GPU issue - system that was stable on CPUs crashes
Hannah Baumann
hbaumann at uci.edu
Thu Jun 13 18:07:39 CEST 2019
Hi,
Thanks for your suggestions!
I looked at the box volume over time in the CPU-only NPT run, and the
volume doesn’t change a lot:
time (ps)    volume (nm^3)
  0.000000   523.472656
  1.000000   520.086731
  2.000000   520.337769
  3.000000   518.600708
  ...
100.000000   521.340820
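To put a number on "doesn't change a lot", here is a minimal Python sketch that finds the frame furthest from the starting volume. It uses only the five values listed above (the elided frames are not filled in), and the helper name `max_relative_deviation` is purely illustrative.

```python
# Volumes (nm^3) copied from the CPU-only NPT listing; elided frames omitted.
cpu_npt = [
    (0.0, 523.472656),
    (1.0, 520.086731),
    (2.0, 520.337769),
    (3.0, 518.600708),
    (100.0, 521.340820),
]


def max_relative_deviation(series):
    """Return (time, fraction) for the frame furthest from the first volume."""
    v0 = series[0][1]
    t, v = max(series, key=lambda tv: abs(tv[1] - v0))
    return t, (v - v0) / v0


t, dev = max_relative_deviation(cpu_npt)
print(f"largest deviation at t={t:g}: {dev:+.2%}")
```

For the listed frames the largest deviation is a little under 1% of the starting volume, at the 3.000000 frame.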
I used Parrinello-Rahman as the barostat (on both CPU and GPU).
However, when I switch to the Berendsen barostat, the system does run on
GPUs. At first I get warnings (Step 20 Warning: pressure scaling more than
1%, mu: 0.984963 0.984963 0.984963), but after 4 ps it runs stably.
Here the volume drops initially and then equilibrates around the starting
volume after ~4 ps:
time (ps)    volume (nm^3)
  0.000000   523.472656
  1.000000   471.946625
  2.000000   506.520508
  3.000000   518.643494
  4.000000   522.728210
  ...
100.000000   521.103638
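The warning and the listing can be compared with a little arithmetic. The sketch below assumes that the three `mu` values in the warning are the per-axis box scaling factors applied in a single pressure-coupling step, so that their product is the factor by which the volume changes in that step. That reading of the log line is an assumption, not something confirmed in this thread. The second part computes the drop between the first two listed frames.

```python
# mu values copied from the warning above. Assumption: each one is the
# scaling factor applied to one box dimension in one pressure-coupling step.
mu = (0.984963, 0.984963, 0.984963)

volume_factor = mu[0] * mu[1] * mu[2]
print(f"volume change implied by one scaling step: {volume_factor - 1:+.2%}")

# First two frames of the Berendsen NPT listing (nm^3).
v_start, v_next = 523.472656, 471.946625
observed = (v_next - v_start) / v_start
print(f"observed change between the first two frames: {observed:+.2%}")
```

Under that assumption a single scaling step shrinks the box volume by roughly 4.4%, while the first two listed frames differ by roughly 9.8%.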
Trying to continue with the production run using Parrinello-Rahman leads to
the same crash due to LINCS warnings.
With the Berendsen barostat the production run does not crash, but there is
again an initial decrease in volume:
time (ps)    volume (nm^3)
  0.000000   521.103638
  2.000000   501.624176
  4.000000   516.751648
  6.000000   520.300598
Do you have an idea why the volume first decreases and then, once
equilibrated, returns close to the starting volume?
Do you think changing the box size would help in this case?
Best,
Hannah
On Jun 12, 2019, at 1:03 PM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
Hi,
The most likely explanation here is that the starting configuration isn't
very happy (box of wrong size? equilibrating with the unsuitable
parrinello-rahman algorithm?). The CPU-only run does a different domain
decomposition than the run with GPUs, leading to a different relaxation
trajectory, which happens to have smaller forces resulting from the unhappy
particles.
I suggest you look at the change of volume over the CPU-only NPT run, and
perhaps reconsider the way you did the initial box preparation. Of course,
if problems persist once you're in the production phase, then we should
look deeper into it!
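One way to do that volume check is sketched below in Python. It assumes the volume has already been written out as xvg-style text (two columns, with `#` and `@` lines as comments and plot directives). The helper names, the header lines in `sample`, and the 1% tolerance are illustrative assumptions; the numeric rows are the Berendsen NPT values reported in this thread.

```python
def read_xvg(lines):
    """Parse (time, value) pairs from xvg-style text, skipping '#'/'@' lines."""
    data = []
    for line in lines:
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        fields = line.split()
        data.append((float(fields[0]), float(fields[1])))
    return data


def frames_outside(data, tol):
    """Frames whose volume differs from the last frame by more than tol (fraction)."""
    v_end = data[-1][1]
    return [(t, (v - v_end) / v_end) for t, v in data
            if abs(v - v_end) / v_end > tol]


# Illustrative input: the header lines are made up, the rows are from the thread.
sample = """\
# illustrative comment line
@ title "Volume"
0.000000 523.472656
1.000000 471.946625
2.000000 506.520508
3.000000 518.643494
4.000000 522.728210
100.000000 521.103638
""".splitlines()

for t, dev in frames_outside(read_xvg(sample), tol=0.01):
    print(f"t={t:g}: {dev:+.2%} relative to the final volume")
```

With a real file, `read_xvg` can be given an open file handle instead of `sample`, since it only iterates over lines. Frames that are flagged only at the very start of the run point to a box that relaxes quickly, whereas a long list suggests the initial box preparation deserves another look.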
Mark
On Wed., 12 Jun. 2019, 20:42 Hannah Magdalena Baumann, <hbaumann at uci.edu>
wrote:
Hi,
I've been running into the following issue in my MD simulation:
When running the simulation on CPUs, everything worked fine. But when
running the simulation with the same input files on GPUs, only minimization
and NVT equilibration run successfully. When starting the NPT equilibration,
the system crashes due to LINCS warnings:
Step 270, time 0.54 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.000084, max 0.001286 (between atoms 2019 and 2017)
bonds that rotated more than 30 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   1174   1173   34.2    0.1090   0.1090      0.1090
   1176   1175   34.4    0.1090   0.1090      0.1090
I ran the simulation with both Gromacs 2018-3 and 2019-beta1;
both runs crashed for the same reason.
These were the commands to start the simulation:
    gmx grompp -f $MDP/npt.mdp -c ../NVT/nvt.gro -p $FREE_ENERGY/complex.top \
        -t ../NVT/nvt.cpt -o npt.tpr
    gmx mdrun -s npt.tpr -deffnm npt -nt $SLURM_CPUS_PER_TASK \
        -gpu_id $CUDA_VISIBLE_DEVICES -nb gpu -pin on
GPU: NVIDIA GeForce GTX TITAN X, compute cap.: 5.2, ECC: no, stat:
compatible
Can you see any reason why the system crashes on the GPU? Has anyone
experienced a similar issue?
Best regards,
Hannah
--
Gromacs Users mailing list
* Please search the archive at
http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
posting!
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
send a mail to gmx-users-request at gromacs.org.