[gmx-users] excess backups, and minimization issue
Samuel Flores
samuelfloresc at gmail.com
Tue Aug 9 14:36:43 CEST 2016
Hi Justin,
I think the only line that used "srun" rather than "srun -n 1" (and was not an mdrun call) was this one:
> srun gmx_mpi grompp -f ions.mdp -c threaded-truncated_solv.gro -p topol.top -o ions.tpr
I added the -n 1 flag to this command; it was indeed the one generating all the backups. However, when following your suggestion, I find I cannot run grompp without the gmx_mpi prefix on my install:
[samuelf at aurora1 proteinA]$ grompp -f ions.mdp -c threaded-truncated_solv.gro -p topol.top -o ions.tpr
-bash: grompp: command not found
Anyway, I imagine that with the -n 1 flag what I do is kosher enough? I don't really like running things interactively on the command line anyway, as I find it is not repeatable enough for my taste.
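For completeness: I suppose the alternative closer to your suggestion would be to call the binary directly, without srun, on the login node for the preparation steps. I have not tried that, and I assume it only works if Intel MPI tolerates a singleton launch outside of srun:

gmx_mpi grompp -f ions.mdp -c threaded-truncated_solv.gro -p topol.top -o ions.tpr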
I am now having a problem with energy minimization:
-------------------------------------------------------
Program gmx mdrun, VERSION 5.1.2
Source code file: /local/easybuild/build/GROMACS/5.1.2/intel-2016a-hybrid/gromacs-5.1.2/src/gromacs/mdlib/constr.cpp, line: 555
Fatal error:
step 23: Water molecule starting at atom 4636051 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
The offending command is:
srun gmx_mpi mdrun -v -deffnm em
This is a protein that I generated by homology modeling. I provided no water; all solvation is done by GROMACS. I suspect the problem is a mild clash in the protein that perturbed a neighboring water molecule. Perhaps it is best to minimize the protein alone prior to solvation, maybe even prior to the editconf step? Or perhaps "minim.mdp" in my existing workflow should have a "define = -DPOSRES" statement? Is there a protocol for this, or a better way to diagnose the problem? I sketch the two options I have in mind just below, and append my updated job file after that.
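To be concrete, this is roughly what I am picturing; posre.itp and the #ifdef POSRES block are whatever pdb2gmx wrote for this topology, and minim_vac.mdp is a hypothetical copy of minim.mdp adjusted for vacuum (no PME, plain cutoffs), so please treat this as a guess rather than a recipe:

# option 1: restrain the protein heavy atoms during the solvated EM by adding
# a single line to minim.mdp:
#   define = -DPOSRES
# option 2: minimize the protein alone before solvation, then feed the minimized
# coordinates to solvate in place of threaded-truncated_newbox.gro:
srun -n 1 gmx_mpi grompp -f minim_vac.mdp -c threaded-truncated_newbox.gro -p topol.top -o em_vac.tpr
srun gmx_mpi mdrun -v -deffnm em_vac
srun -n 1 gmx_mpi solvate -cp em_vac.gro -cs spc216.gro -o threaded-truncated_solv.gro -p topol.top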
Many thanks
Sam
#!/bin/bash -l
#SBATCH -J mine
#SBATCH -N 6
#SBATCH --tasks-per-node=20
#SBATCH --exclusive
#SBATCH -A snic2015-16-49
#SBATCH -t 168:00:00
# tried -N12. timed out. trying 6 now.
# Disable backups. These have been causing problems for unclear reasons
export GMX_MAXBACKUP=-1
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH
# End part that is written in /home/samuelf/svn/breeder/src/Breed.cpp:87
# From here down, this file is generated in /home/samuelf/svn/breeder/src/MysqlConnection.cpp:97
cd /lunarc/nobackup/users/samuelf/proteinA-mine/
echo " now working in : /lunarc/nobackup/users/samuelf/proteinA-mine/"
cp /home/samuelf/svn/breeder//singleMutantFiles/ions.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
cp /home/samuelf/svn/breeder//singleMutantFiles/md.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
#cp /home/samuelf/svn/breeder//singleMutantFiles/mdout.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
cp /home/samuelf/svn/breeder//singleMutantFiles/minim.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
cp /home/samuelf/svn/breeder//singleMutantFiles/npt.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
cp /home/samuelf/svn/breeder//singleMutantFiles/nvt.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
# Write cluster-specific configuration and module load commands.
# The following portion is being read from >/home/samuelf/svn/breeder//singleMutantFiles/gromacs-commands.txt<
#export OMP_NUM_THREADS=1
# Comment in original file: /home/samuelf/projects/1FC2.domainZ/gromacs-commands.txt
#module load intel/2016a
module load icc/2016.1.150-GCC-4.9.3-2.25 impi/5.1.2.150
module load GROMACS/5.1.2-hybrid
# End portion from >/home/samuelf/svn/breeder//singleMutantFiles/gromacs-commands.txt<
echo 6 > temp.txt
echo 1 >> temp.txt
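# temp.txt answers pdb2gmx's two interactive prompts in order: 6 = force field choice,
# 1 = water model choice (the exact numbers depend on the force field list of this install)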
echo " Check 1"
cat temp.txt | srun -n 1 gmx_mpi pdb2gmx -f threaded-truncated.pdb -o threaded-truncated_processed.gro -ignh
echo " Check 2"
#6: Amber sb99, 3-point TIP3P water model: force field was selected based on this benchmark article: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2905107/
srun -n 1 gmx_mpi editconf -f threaded-truncated_processed.gro -o threaded-truncated_newbox.gro -c -d 1.0 -bt cubic
echo " Check 3"
srun -n 1 gmx_mpi solvate -cp threaded-truncated_newbox.gro -cs spc216.gro -o threaded-truncated_solv.gro -p topol.top
# threaded-truncated_solv.gro has full length SpA:
echo " Check 4"
srun -n 1 gmx_mpi grompp -f ions.mdp -c threaded-truncated_solv.gro -p topol.top -o ions.tpr
# threaded-truncated_solv_ions.gro has the full length SpA:
echo " Check 5"
echo 13 | srun -n 1 gmx_mpi genion -s ions.tpr -o threaded-truncated_solv_ions.gro -p topol.top -pname NA -neutral
# previously had erroneous srun -n 1gmx_mpi ... :
echo " Check 6"
srun -n 1 gmx_mpi grompp -f minim.mdp -c threaded-truncated_solv_ions.gro -p topol.top -o em.tpr
echo " Check 7"
srun gmx_mpi mdrun -v -deffnm em
# em.gro has only one domain for some reason
echo " Check 8"
srun -n 1 gmx_mpi grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
echo " Check 9"
srun gmx_mpi mdrun -deffnm nvt
echo " Check 10"
srun -n 1 gmx_mpi grompp -f npt.mdp -c nvt.gro -t nvt.cpt -p topol.top -o npt.tpr
echo " Check 11"
srun gmx_mpi mdrun -deffnm npt
echo " Check 12"
srun -n 1 gmx_mpi grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr
echo " Check 13"
srun gmx_mpi mdrun -deffnm md_0_1
# insert DSSP commands here
# End of GROMACS command generator./home/samuelf/svn/breeder/src/MysqlConnection.cpp:161
# Returning from /home/samuelf/svn/breeder/src/MysqlConnection.cpp:163
> On Aug 4, 2016, at 00:13, Justin Lemkul <jalemkul at vt.edu> wrote:
>
>
>
> On 8/3/16 6:02 PM, Samuel Flores wrote:
>>
>>
>> sorry wrong subject line!
>>
>> Guys,
>>
>> I have been plagued with an odd backup issue that I don't understand. Gromacs seems to be making scads of successive backups of certain files, even in a single run. This often leads to death due to an excess of backups. Here is the first error:
>>
>> Program gmx grompp, VERSION 5.1.2
>> Source code file: /local/easybuild/build/GROMACS/5.1.2/intel-2016a-hybrid/gromacs-5.1.2/src/gromacs/utility/futil.cpp, line: 409
>>
>> Fatal error:
>> Won't make more than 99 backups of ions.tpr for you.
>> The env.var. GMX_MAXBACKUP controls this maximum, -1 disables backups.
>> For more information and tips for troubleshooting, please check the GROMACS
>> website at http://www.gromacs.org/Documentation/Errors
>> -------------------------------------------------------
>>
>> Halting parallel program gmx grompp on rank 67 out of 120
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 67
>> Calculating fourier grid dimensions for X Y Z
>>
>> .. which I believe is due to this command:
>>
>> gmx_mpi grompp -f ions.mdp -c threaded-truncated_solv.gro -p topol.top -o ions.tpr
>>
>>
>> For now I set export GMX_MAXBACKUP=-1. But it would be nice to know what is actually happening. Can anyone help? I append my SLURM job file below.
>>
>> I would also like it if someone told me what is up with these ranks .. Not having worked much with MPI, I have the vague impression that these are processes or threads. In any case why are these processes being killed? Is this normal?
>>
>
> The only GROMACS program that benefits from MPI is mdrun. You're effectively launching 120 instances of every command before that, which serves no purpose and actively leads to the fatal error. Run preparation steps locally, then ship the .tpr off to the cluster for the actual calculation.
>
> -Justin
>
>>
>> Many thanks,
>>
>> Sam
>>
>>
>>
>> [samuelf at aurora1 proteinA-mine]$ cat job.proteinA-mine
>> #!/bin/bash -l
>> #SBATCH -J mine
>> #SBATCH -N 6
>> #SBATCH --tasks-per-node=20
>> #SBATCH --exclusive
>> #SBATCH -A snic2015-16-49
>> #SBATCH -t 168:00:00
>> # tried -N12. timed out. trying 6 now.
>>
>> # Disable backups. These have been causing problems for unclear reasons
>> export GMX_MAXBACKUP=-1
>>
>> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH
>> # End part that is written in /home/samuelf/svn/breeder/src/Breed.cpp:87
>>
>> # From here down, This file is generated in /home/samuelf/svn/breeder/src/MysqlConnection.cpp:97
>> cd /lunarc/nobackup/users/samuelf/proteinA-mine/
>> echo " now working in : /lunarc/nobackup/users/samuelf/proteinA-mine/"
>> cp /home/samuelf/svn/breeder//singleMutantFiles/ions.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
>> cp /home/samuelf/svn/breeder//singleMutantFiles/md.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
>> #cp /home/samuelf/svn/breeder//singleMutantFiles/mdout.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
>> cp /home/samuelf/svn/breeder//singleMutantFiles/minim.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
>> cp /home/samuelf/svn/breeder//singleMutantFiles/npt.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
>> cp /home/samuelf/svn/breeder//singleMutantFiles/nvt.mdp /lunarc/nobackup/users/samuelf/proteinA-mine/ ;
>> # Write cluster-specific configuration and module load commands..
>> # The following portion is being read from >/home/samuelf/svn/breeder//singleMutantFiles/gromacs-commands.txt<
>>
>> #export OMP_NUM_THREADS=1
>> # Comment in original file: /home/samuelf/projects/1FC2.domainZ/gromacs-commands.txt
>> #module load intel/2016a
>> module load icc/2016.1.150-GCC-4.9.3-2.25 impi/5.1.2.150
>> module load GROMACS/5.1.2-hybrid
>>
>>
>> # End portion from >/home/samuelf/svn/breeder//singleMutantFiles/gromacs-commands.txt<
>> echo 6 > temp.txt
>> echo 1 >> temp.txt
>> echo " Check 1"
>> cat temp.txt | srun -n 1 gmx_mpi pdb2gmx -f threaded-truncated.pdb -o threaded-truncated_processed.gro -ignh
>> echo " Check 2"
>> #6: Amber sb99 , 3 point TIP3P water model: force field was selected based on this benchmark article: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2905107/
>> srun gmx_mpi editconf -f threaded-truncated_processed.gro -o threaded-truncated_newbox.gro -c -d 1.0 -bt cubic
>> echo " Check 3"
>> srun -n 1 gmx_mpi solvate -cp threaded-truncated_newbox.gro -cs spc216.gro -o threaded-truncated_solv.gro -p topol.top
>> # threaded-truncated_solv.gro has full length SpA:
>> echo " Check 4"
>> srun gmx_mpi grompp -f ions.mdp -c threaded-truncated_solv.gro -p topol.top -o ions.tpr
>> # threaded-truncated_solv_ions.gro has the full length SpA:
>> echo " Check 5"
>> echo 13 | srun -n 1 gmx_mpi genion -s ions.tpr -o threaded-truncated_solv_ions.gro -p topol.top -pname NA -neutral
>> # previously had erroneous srun -n 1gmx_mpi ... :
>> echo " Check 6"
>> srun -n 1 gmx_mpi grompp -f minim.mdp -c threaded-truncated_solv_ions.gro -p topol.top -o em.tpr
>> echo " Check 7"
>> srun gmx_mpi mdrun -v -deffnm em
>> # em.gro has only one domain for some reason
>> echo " Check 8"
>> srun -n 1 gmx_mpi grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
>> echo " Check 9"
>> srun gmx_mpi mdrun -deffnm nvt
>> echo " Check 10"
>> srun -n 1 gmx_mpi grompp -f npt.mdp -c nvt.gro -t nvt.cpt -p topol.top -o npt.tpr
>> echo " Check 11"
>> srun gmx_mpi mdrun -deffnm npt
>> echo " Check 12"
>> srun -n 1 gmx_mpi grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr
>> echo " Check 13"
>> srun gmx_mpi mdrun -deffnm md_0_1
>> # insert DSSP commands here
>> # End of GROMACS command generator./home/samuelf/svn/breeder/src/MysqlConnection.cpp:161
>> # Returning from /home/samuelf/svn/breeder/src/MysqlConnection.cpp:163
>>
>>
>> Samuel Coulbourn Flores
>> Computational and Systems Biology Program
>> Department of Cell and Molecular Biology
>> Uppsala University
>>
>
> --
> ==================================================
>
> Justin A. Lemkul, Ph.D.
> Ruth L. Kirschstein NRSA Postdoctoral Fellow
>
> Department of Pharmaceutical Sciences
> School of Pharmacy
> Health Sciences Facility II, Room 629
> University of Maryland, Baltimore
> 20 Penn St.
> Baltimore, MD 21201
>
> jalemkul at outerbanks.umaryland.edu | (410) 706-7441
> http://mackerell.umaryland.edu/~jalemkul
>
> ==================================================
> --
> Gromacs Users mailing list
>
> * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.
Samuel Coulbourn Flores
Computational and Systems Biology Program
Department of Cell and Molecular Biology
Uppsala University
Cell: +46 706.000.464
Phone: +46 (0) 18-471 45 36
Skype: samuelfloresc
Office: BMC C8:217a
Deliveries: BMC Box 596, Uppsala 75124