Re: [gmx-users] Re: wierd behavior of mdrun
Paymon Pirzadeh
ppirzade at ucalgary.ca
Tue Sep 8 22:04:49 CEST 2009
OK! I tested your suggestion about frequent output. As you had
predicted, it crashed immediately. But the message I got from the
cluster was:
PBS Job Id: 94865.orca1.ibb
Job Name: AFPIII_NVT275test
job deleted
Job deleted at request of root at elder2.ibb
MOAB_INFO: job was rejected - job violates qos configuration 'job
'94865' violates MINPROC policy of 4 (R: 1, U: 0)'
This is despite the fact that my submission script requests 8 CPUs!
As I mentioned earlier, the same script is used for my water systems.
Regards,
Payman
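
The fields of the MOAB rejection quoted above can be pulled apart with a short script. This is a minimal sketch and not part of the original exchange: the function name parse_moab_rejection is hypothetical, and reading R as the processor count the scheduler registered as requested and U as the count in use is an assumption about Moab's shorthand, not something the message itself states.

```python
import re

# Pattern for the policy-violation text quoted in the message above.
# Treating R as "requested" and U as "in use" is an assumption.
MOAB_RE = re.compile(
    r"job\s+'(?P<job>\d+)'\s+violates\s+(?P<policy>\w+)\s+policy\s+of\s+"
    r"(?P<limit>\d+)\s+\(R:\s*(?P<r>\d+),\s*U:\s*(?P<u>\d+)\)"
)


def parse_moab_rejection(text):
    """Return the fields of a Moab policy rejection, or None if absent."""
    # Collapse the line wraps introduced by the mail client before matching.
    flat = " ".join(text.split())
    m = MOAB_RE.search(flat)
    if m is None:
        return None
    return {
        "job": m.group("job"),
        "policy": m.group("policy"),
        "limit": int(m.group("limit")),
        "r": int(m.group("r")),
        "u": int(m.group("u")),
    }


message = """MOAB_INFO: job was rejected - job violates qos configuration 'job
'94865' violates MINPROC policy of 4 (R: 1, U: 0)'"""

info = parse_moab_rejection(message)
print(info)
if info and info["policy"] == "MINPROC" and info["r"] < info["limit"]:
    print("R is below the MINPROC limit; check the processor request of the job")
```

If that reading of R is right, the scheduler registered a request for 1 processor, below the minimum of 4. That would point at how the resource request in the submission script is being interpreted, not at GROMACS itself.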
On Fri, 2009-09-04 at 19:29 -0400, Justin A. Lemkul wrote:
> Have you tried my suggestion from the last message of setting frequent output?
> Could your system just be collapsing at the outset of the simulation? Setting
> nstxout = 1 would catch something like this.
>
> There is nothing special about treating a protein in parallel vs. a system of
> water. Since a system of water runs just fine, it seems even more likely to me
> that your system is simply crashing immediately, rather than a problem with
> Gromacs or the MPI implementation.
>
> -Justin
>
> Paymon Pirzadeh wrote:
> > Regarding the problems I have running my protein system in parallel
> > (it runs without output): when I run a pure water system, everything is
> > fine. I have tested pure water systems 8 times larger than my protein
> > system; the former run fine, while the latter has problems. I have also
> > tested pure water systems with approximately the same number of sites in
> > the .gro file as in my protein .gro file, and with the same input file in
> > terms of output settings; they are fine. I would like to know what happens
> > in GROMACS when a protein is added to the system. The cluster admin has
> > not gotten back to me, but I still want to check that there is no problem
> > with my setup (although my system runs fine in serial mode).
> > Regards,
> >
> > Payman
> >
> >
> >
> > On Fri, 2009-08-28 at 16:41 -0400, Justin A. Lemkul wrote:
> >> Payman Pirzadeh wrote:
> >>> There is something strange about this problem, which I suspect might be
> >>> due to the mdp file and input. I can run the energy minimization without
> >>> any problems (I submit the job and it apparently works using the same
> >>> submission script)! But as soon as I prepare the tpr file for the MD run,
> >>> I run into this run-without-output trouble.
> >>> Again, I paste my mdp file below (I want to run an NVT simulation):
> >>>
> >> There isn't anything in the .mdp file that suggests you wouldn't get any output.
> >> The output of mdrun is buffered, so depending on your settings, you may have
> >> more frequent output during energy minimization. There may be some problem with
> >> the MPI implementation in buffering and communicating data properly. That's a
> >> bit of a guess, but it could be happening.
> >>
> >> Definitely check with the cluster admin to see if there are any error messages
> >> reported for the jobs you submitted.
> >>
> >> Another test you could do to force a huge amount of data would be to set all of
> >> your outputs (nstxout, nstxtcout, etc) = 1 and run a much shorter simulation (to
> >> prevent massive data output!); this would force more continuous data through the
> >> buffer.
> >>
> >> -Justin
> >>
> >>> cpp = cpp
> >>> include = -I../top
> >>> define = -DPOSRES
> >>>
> >>> ; Run control
> >>>
> >>> integrator = md
> >>> dt = 0.001 ;1 fs
> >>> nsteps = 3000000 ;3 ns
> >>> comm_mode = linear
> >>> nstcomm = 1
> >>>
> >>> ;Output control
> >>>
> >>> nstxout = 5000
> >>> nstlog = 5000
> >>> nstenergy = 5000
> >>> nstxtcout = 1500
> >>> nstvout = 5000
> >>> nstfout = 5000
> >>> xtc_grps =
> >>> energygrps =
> >>>
> >>> ; Neighbour Searching
> >>>
> >>> nstlist = 10
> >>> ns_type = grid
> >>> rlist = 0.9
> >>> pbc = xyz
> >>>
> >>> ; Electrostatics
> >>>
> >>> coulombtype = PME
> >>> rcoulomb = 0.9
> >>> ;epsilon_r = 1
> >>>
> >>> ; Vdw
> >>>
> >>> vdwtype = cut-off
> >>> rvdw = 1.2
> >>> DispCorr = EnerPres
> >>>
> >>> ;Ewald
> >>>
> >>> fourierspacing = 0.12
> >>> pme_order = 4
> >>> ewald_rtol = 1e-6
> >>> optimize_fft = yes
> >>>
> >>> ; Temperature coupling
> >>>
> >>> tcoupl = v-rescale
> >>> ld_seed = -1
> >>> tc-grps = System
> >>> tau_t = 0.1
> >>> ref_t = 275
> >>>
> >>> ; Pressure Coupling
> >>>
> >>> Pcoupl = no
> >>> ;Pcoupltype = isotropic
> >>> ;tau_p = 1.0
> >>> ;compressibility = 5.5e-5
> >>> ;ref_p = 1.0
> >>> gen_vel = yes
> >>> gen_temp = 275
> >>> gen_seed = 173529
> >>> constraint-algorithm = Lincs
> >>> constraints = all-bonds
> >>> lincs-order = 4
> >>>
> >>> Regards,
> >>>
> >>> Payman
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: gmx-users-bounces at gromacs.org [mailto:gmx-users-bounces at gromacs.org]
> >>> On Behalf Of Mark Abraham
> >>> Sent: August 27, 2009 3:32 PM
> >>> To: Discussion list for GROMACS users
> >>> Subject: Re: [gmx-users] Re: wierd behavior of mdrun
> >>>
> >>> Vitaly V. Chaban wrote:
> >>>> Then I believe you have problems with MPI.
> >>>>
> >>>> I previously experienced something similar on our old system: the
> >>>> serial version worked OK but the parallel one failed. The same issue
> >>>> occurred with CPMD, by the way. Other programs worked fine. I never
> >>>> fixed that problem...
> >>>>
> >>>> On Thu, Aug 27, 2009 at 7:14 PM, Paymon Pirzadeh <ppirzade at ucalgary.ca> wrote:
> >>>>> Yes,
> >>>>> it works when it is run on one processor interactively!
> >>> That's fine, but it doesn't mean the problem is with the parallelism, as
> >>> Vitaly suggests. If your cluster filesystem isn't configured properly,
> >>> you will observe these symptoms. Since the submission script was the
> >>> same, MPI worked previously, so isn't likely to be the problem...
> >>>
> >>> Mark
> >>>
> >>>>> On Thu, 2009-08-27 at 09:23 +0300, Vitaly V. Chaban wrote:
> >>>>>>> I made a .tpr file for my md run without any problems (using the mdp
> >>>>>>> file below). My job submission script is the same one I used for
> >>>>>>> other jobs, which had no problems. But now when I submit this .tpr file,
> >>>>>>> only an empty log file is generated! The qstat of the cluster shows that
> >>>>>>> the job is running, and the processors are 100% engaged, yet I have no
> >>>>>>> output!
> >>>>>> A standard guess: what about trying to run the single-processor job on
> >>>>>> the same cluster? Does it run OK?
> >>>>>>
> >>>>>>
> >>>> _______________________________________________
> >>>> gmx-users mailing list gmx-users at gromacs.org
> >>>> http://lists.gromacs.org/mailman/listinfo/gmx-users
> >>>> Please search the archive at http://www.gromacs.org/search before posting!
> >>>> Please don't post (un)subscribe requests to the list. Use the
> >>>> www interface or send it to gmx-users-request at gromacs.org.
> >>>> Can't post? Read http://www.gromacs.org/mailing_lists/users.php
> >>>>
> >
> >
>
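
The frequent-output test suggested in the thread (set every output interval to 1 and run a much shorter simulation) can be prepared by rewriting a copy of the .mdp file. The following is a minimal sketch under stated assumptions: the helper name make_diagnostic_mdp is hypothetical, nsteps = 100 is an arbitrary illustrative value that does not come from the thread, and the list of keys is limited to the output parameters that appear in the posted .mdp file.

```python
import re

# Output-interval keys that appear in the posted .mdp file; each is set to 1.
OUTPUT_KEYS = ("nstxout", "nstvout", "nstfout", "nstlog", "nstenergy", "nstxtcout")


def make_diagnostic_mdp(mdp_text, nsteps=100):
    """Return a copy of mdp_text with frequent output and a short run.

    nsteps=100 is an arbitrary illustrative value, not one from the thread.
    """
    overrides = {key: "1" for key in OUTPUT_KEYS}
    overrides["nsteps"] = str(nsteps)
    seen = set()
    out = []
    for line in mdp_text.splitlines():
        # Match "key = value" lines; comment lines starting with ';' never
        # match and are therefore copied through unchanged.
        m = re.match(r"\s*([A-Za-z0-9_\-]+)\s*=", line)
        key = m.group(1).lower().replace("-", "_") if m else None
        if key in overrides:
            out.append(f"{key} = {overrides[key]}")
            seen.add(key)
        else:
            out.append(line)
    # Append any override whose key was absent from the input.
    for key, value in overrides.items():
        if key not in seen:
            out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


sample = """integrator = md
dt = 0.001 ;1 fs
nsteps = 3000000 ;3 ns
nstxout = 5000
nstxtcout = 1500
"""
print(make_diagnostic_mdp(sample))
```

Writing the result to a separate file and building a new .tpr from it keeps the production settings untouched. With every interval at 1 the output grows quickly, which is why the run length is cut at the same time.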