Re: [gmx-users] Re: wierd behavior of mdrun

Tue Sep 8 22:04:49 CEST 2009

OK! I tested your suggestion about frequent output. As you
predicted, it crashed immediately. But the message I got from the
cluster was:

PBS Job Id: 94865.orca1.ibb
Job Name:   AFPIII_NVT275test
job deleted
Job deleted at request of root at elder2.ibb
MOAB_INFO:  job was rejected - job violates qos configuration 'job
'94865' violates MINPROC policy of 4 (R: 1, U: 0)'

This is despite the fact that in my submission script, the number of
CPUs is 8! As I mentioned earlier, the same script is used for my water
systems.
Regards,

Payman
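
The frequent-output test discussed in this thread (every output interval
set to 1, combined with a much shorter run) can be sketched as a small
Python helper that rewrites an existing .mdp file. This is only a sketch:
the function name make_debug_mdp, the default of 1000 steps, and the idea
of scripting the change at all are assumptions, not something taken from
the messages. The parameter names are the ones that appear in the .mdp
file quoted below.

```python
import re

# Output-interval parameters that appear in the .mdp file quoted in this thread.
OUTPUT_KEYS = ("nstxout", "nstvout", "nstfout", "nstlog", "nstenergy", "nstxtcout")


def make_debug_mdp(mdp_text, nsteps=1000):
    """Return mdp_text with every output interval set to 1 and a short nsteps.

    nsteps=1000 is an arbitrary assumption; choose a value that keeps the
    output files small.
    """
    overrides = {key: "1" for key in OUTPUT_KEYS}
    overrides["nsteps"] = str(nsteps)
    seen = set()
    out_lines = []
    for line in mdp_text.splitlines():
        # Match "name = value" lines; lines starting with ';' do not match.
        match = re.match(r"\s*([A-Za-z0-9_-]+)\s*=", line)
        # Lower-case the name and fold '-' into '_' so spelling variants compare equal.
        key = match.group(1).lower().replace("-", "_") if match else None
        if key in overrides:
            # Replace the whole line; any trailing comment on it is dropped.
            out_lines.append(f"{key:<16} = {overrides[key]}")
            seen.add(key)
        else:
            out_lines.append(line)
    # Append any parameter that was not set in the original text.
    for key, value in overrides.items():
        if key not in seen:
            out_lines.append(f"{key:<16} = {value}")
    return "\n".join(out_lines) + "\n"
```

For example, reading the production .mdp file, passing its text through
make_debug_mdp, and writing the result to a separate file would give a
test input for grompp while leaving the original file untouched.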

On Fri, 2009-09-04 at 19:29 -0400, Justin A. Lemkul wrote:
> Have you tried my suggestion from the last message of setting frequent output? 
> Could your system just be collapsing at the outset of the simulation?  Setting 
> nstxout = 1 would catch something like this.
> 
> There is nothing special about treating a protein in parallel vs. a system of 
> water.  Since a system of water runs just fine, it seems even more likely to me 
> that your system is simply crashing immediately, rather than a problem with 
> Gromacs or the MPI implementation.
> 
> -Justin
> 
> Paymon Pirzadeh wrote:
> > Regarding the problems I have running my protein system in parallel
> > (it runs without output): when I run a pure water system, everything is
> > fine. I have tested pure water systems 8 times larger than my protein
> > system; the former run fine, while the latter has problems. I have also
> > tested pure water systems with approximately the same number of sites in
> > the .gro file as in my protein .gro file, and with the same input file in
> > terms of output frequency; they are fine. I would like to know what
> > happens to GROMACS when a protein is added to the system. The cluster
> > admin has not gotten back to me, but I still want to check that there is
> > no problem with my setup (although my system runs fine in serial mode)!
> > Regards,
> > 
> > Payman
> > 
> > 
> > 
> > On Fri, 2009-08-28 at 16:41 -0400, Justin A. Lemkul wrote:
> >> Payman Pirzadeh wrote:
> >>> There is something strange about this problem, which I suspect might be due to
> >>> the mdp file and input. I can run the energy minimization without any
> >>> problems (I submit the job and it apparently works using the same submission
> >>> script)! But as soon as I prepare the tpr file for the MD run, I run into
> >>> this run-without-output trouble.
> >>> Again, I paste my mdp file below (I want to run an NVT run):
> >>>
> >> There isn't anything in the .mdp file that suggests you wouldn't get any output.
> >> The output of mdrun is buffered, so depending on your settings, you may have
> >> more frequent output during energy minimization.  There may be some problem with
> >> the MPI implementation in buffering and communicating data properly.  That's a
> >> bit of a guess, but it could be happening.
> >>
> >> Definitely check with the cluster admin to see if there are any error messages 
> >> reported for the jobs you submitted.
> >>
> >> Another test you could do to force a huge amount of data would be to set all of 
> >> your outputs (nstxout, nstxtcout, etc) = 1 and run a much shorter simulation (to 
> >> prevent massive data output!); this would force more continuous data through the 
> >> buffer.
> >>
> >> -Justin
> >>
> >>> cpp              = cpp
> >>> include          = -I../top
> >>> define           = -DPOSRES
> >>>
> >>> ; Run control
> >>>
> >>> integrator       = md
> >>> dt               = 0.001           ;1 fs
> >>> nsteps           = 3000000         ;3 ns
> >>> comm_mode        = linear
> >>> nstcomm          = 1
> >>>
> >>> ;Output control
> >>>
> >>> nstxout          = 5000
> >>> nstlog           = 5000
> >>> nstenergy        = 5000
> >>> nstxtcout        = 1500
> >>> nstvout          = 5000
> >>> nstfout          = 5000
> >>> xtc_grps         =
> >>> energygrps       =
> >>>
> >>> ; Neighbour Searching
> >>>
> >>> nstlist          = 10
> >>> ns_type          = grid
> >>> rlist            = 0.9
> >>> pbc              = xyz
> >>>
> >>> ; Electrostatics
> >>>
> >>> coulombtype      = PME
> >>> rcoulomb         = 0.9
> >>> ;epsilon_r        = 1
> >>>
> >>> ; Vdw
> >>>
> >>> vdwtype          = cut-off
> >>> rvdw             = 1.2
> >>> DispCorr         = EnerPres
> >>>
> >>> ;Ewald
> >>>
> >>> fourierspacing  = 0.12
> >>> pme_order       = 4
> >>> ewald_rtol      = 1e-6
> >>> optimize_fft    = yes
> >>>
> >>> ; Temperature coupling
> >>>
> >>> tcoupl           = v-rescale
> >>> ld_seed          = -1
> >>> tc-grps          = System
> >>> tau_t            = 0.1
> >>> ref_t            = 275
> >>>
> >>> ; Pressure Coupling
> >>>
> >>> Pcoupl           = no
> >>> ;Pcoupltype       = isotropic
> >>> ;tau_p            = 1.0
> >>> ;compressibility  = 5.5e-5
> >>> ;ref_p            = 1.0
> >>> gen_vel          = yes
> >>> gen_temp         = 275
> >>> gen_seed         = 173529
> >>> constraint-algorithm     = Lincs
> >>> constraints      = all-bonds
> >>> lincs-order              = 4
> >>>
> >>> Regards,
> >>>
> >>> Payman
> >>>  
> >>>
> >>> -----Original Message-----
> >>> From: gmx-users-bounces at gromacs.org [mailto:gmx-users-bounces at gromacs.org]
> >>> On Behalf Of Mark Abraham
> >>> Sent: August 27, 2009 3:32 PM
> >>> To: Discussion list for GROMACS users
> >>> Subject: Re: [gmx-users] Re: wierd behavior of mdrun
> >>>
> >>> Vitaly V. Chaban wrote:
> >>>> Then I believe you have problems with MPI.
> >>>>
> >>>> I experienced something similar before on our old system: the serial
> >>>> version worked OK but the parallel one failed. The same issue occurred with
> >>>> CPMD, by the way. Other programs worked fine. I didn't correct that
> >>>> problem...
> >>>>
> >>>> On Thu, Aug 27, 2009 at 7:14 PM, Paymon Pirzadeh<ppirzade at ucalgary.ca> wrote:
> >>>>> Yes,
> >>>>> it works when it is run on one processor interactively!
> >>> That's fine, but it doesn't mean the problem is with the parallelism, as 
> >>> Vitaly suggests. If your cluster filesystem isn't configured properly, 
> >>> you will observe these symptoms. Since the submission script was the 
> >>> same, MPI worked previously, so isn't likely to be the problem...
> >>>
> >>> Mark
> >>>
> >>>>> On Thu, 2009-08-27 at 09:23 +0300, Vitaly V. Chaban wrote:
> >>>>>>> I made a .tpr file for my md run without any problems (using the bottom
> >>>>>>> mdp file). My job submission script is also the same thing I used for
> >>>>>>> other jobs which had no problems. But now when I submit this .tpr file,
> >>>>>>> only an empty log file is generated! The qstat of the cluster shows that
> >>>>>>> the job is running, also the processors are 100% engaged while I have no
> >>>>>>> outputs!
> >>>>>> A standard guess: what about trying to run the single-processor job on
> >>>>>> the same cluster? Does it run OK?
> >>>>>>
> >>>>>>
> >>>> _______________________________________________
> >>>> gmx-users mailing list    gmx-users at gromacs.org
> >>>> http://lists.gromacs.org/mailman/listinfo/gmx-users
> >>>> Please search the archive at http://www.gromacs.org/search before posting!
> >>>> Please don't post (un)subscribe requests to the list. Use the 
> >>>> www interface or send it to gmx-users-request at gromacs.org.
> >>>> Can't post? Read http://www.gromacs.org/mailing_lists/users.php
> >>>>
> > 
> > 
>