[gmx-users] Restarting a REST2 simulation

Qinghua Liao scorpio.liao at gmail.com
Thu Apr 9 11:28:48 CEST 2020


Hello Joseph,

You can check all the cpt files to see whether they were all saved at the
same simulation time. Sometimes, some of the cpt files can be incomplete
when they are saved at the last second.
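
To narrow down which checkpoints to look at first, the "subsystem N: part" lines from the mdrun error can be tallied. Below is a minimal sketch in Python. The function name `lagging_replicas` is made up for illustration, and it assumes that subsystem N corresponds to the N-th directory given to -multidir (rep0, rep1, ...), which I have not verified for your setup.

```python
import re
from collections import Counter


def lagging_replicas(error_text):
    """Return {subsystem_index: simulation_part} for every subsystem whose
    simulation part differs from the most common value in the error text."""
    # Matches lines such as "   subsystem 9: 3", with or without a "> " quote prefix.
    # It does not match "subsystems" in the surrounding message lines.
    pattern = re.compile(r"subsystem\s+(\d+):\s*(\d+)")
    parts = {int(idx): int(part) for idx, part in pattern.findall(error_text)}
    if not parts:
        return {}
    # The most frequent simulation part is treated as the "expected" one.
    majority, _ = Counter(parts.values()).most_common(1)[0]
    return {idx: part for idx, part in sorted(parts.items()) if part != majority}


# Values copied from the error message quoted in this thread.
example = "\n".join(
    f">    subsystem {i}: {3 if i in (9, 10, 12) else 4}" for i in range(16)
)
print(lagging_replicas(example))  # {9: 3, 10: 3, 12: 3}
```

With the numbers from your message this points at subsystems 9, 10 and 12, so under the assumption above the state.cpt files in rep9, rep10 and rep12 would be the ones to inspect first, for example with `gmx dump -cp`.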


All the best,
Qinghua

On 4/9/20 11:23 AM, Joseph, Benjamin Philipp wrote:
> Dear members of the mailing list,
>
>
> I restarted my replica exchange with solute tempering (REST2) simulation (16 replicas on 20 = 480 cores) with the following command:
>
> srun gmx_mpi mdrun -plumed plumed.dat -s topol.tpr -multidir rep0 rep1 rep2 rep3 rep4 rep5 rep6 rep7 rep8 rep9 rep10 rep11 rep12 rep13 rep14 rep15 -replex 10000 -hrex -cpi state.cpt -append
>
>
> The simulation does not run many steps after the restart, and I get the following error message:
>
>
> GROMACS:      gmx mdrun, version 2018.3
> Executable:   /usr/local/software/jureca/Stages/Devel-2018b/software/GROMACS/2018.3-intel-para-2018b-plumed/bin/gmx_mpi
> Data prefix:  /usr/local/software/jureca/Stages/Devel-2018b/software/GROMACS/2018.3-intel-para-2018b-plumed
> Working dir:  /p/scratch/cias-5/joseph1/new/S_1/sys1/neu/topos
> Command line:
>    gmx_mpi mdrun -plumed plumed.dat -s topol.tpr -multidir rep0 rep1 rep2 rep3 rep4 rep5 rep6 rep7 rep8 rep9 rep10 rep11 rep12 rep13 rep14 rep15 -replex 10000 -hrex -cpi state.cpt -append
>
> simulation part is not equal for all subsystems
>    subsystem 0: 4
>    subsystem 1: 4
>    subsystem 2: 4
>    subsystem 3: 4
>    subsystem 4: 4
>    subsystem 5: 4
>    subsystem 6: 4
>    subsystem 7: 4
>    subsystem 8: 4
>    subsystem 9: 3
>    subsystem 10: 3
>    subsystem 11: 4
>    subsystem 12: 3
>    subsystem 13: 4
>    subsystem 14: 4
>    subsystem 15: 4
>
> -------------------------------------------------------
> Program:     gmx mdrun, version 2018.3
> Source file: src/gromacs/mdlib/main.cpp (line 115)
> MPI rank:    0 (out of 480)
>
> -------------------------------------------------------
> Program:     gmx mdrun, version 2018.3
> Source file: src/gromacs/mdlib/main.cpp (line 115)
> MPI rank:    90 (out of 480)
>
> Fatal error:
>
> -------------------------------------------------------
>
> ...
>
> -------------------------------------------------------
> Program:     gmx mdrun, version 2018.3
> Source file: src/gromacs/mdlib/main.cpp (line 115)
> MPI rank:    300 (out of 480)
>
> Fatal error:
> The 16 subsystems are not compatible
>
> When I look at the md.log files, they all stopped at the same simulation step, so I do not understand why they are not all at the same simulation part. Thanks a lot in advance for your help!
>
>
> Best regards,
>
>
> Benjamin
>
