[gmx-users] Possible bug in parallelization, PME or load-balancing on Gromacs 4.0_rc1 ??
Bjørn Steen Sæthre
st01397 at student.uib.no
Wed Oct 1 14:41:19 CEST 2008
Hi again Berk,
I know that this particular run used no more than 1:40 hours (I was
following it), but I cannot provide the complete log, as it was
accidentally overwritten by a new run.
I do, however, see the same phenomenon in a shorter annealing trial. I
enclose the entire log in this mail and show excerpts below.
My startup script for this run looked like this:
------------------------------
#!/bin/bash
#PBS -A fysisk
#PBS -N pmf_hydanneal_anneal2
#PBS -o pmf_hydanneal.o
#PBS -e pmf.hydanneal.err
#PBS -l walltime=1:00:00,mppwidth=50,mppnppn=4
cd /work/bjornss/pmf/structII/hydrate_annealing/anneal2
source $HOME/gmx_latest_250908/bin/GMXRC
aprun -n 50 parmdrun -s topol.tpr -maxh 1 -npme 18
exit $?
--------------------------
Now, with -maxh 1 this should stop after 0.99 hours = 59:24.
But as you can see:
----------------------------------------------
head md.log
Log file opened on Mon Sep 29 20:11:42 2008
Host: nid00039 pid: 16507 nodeid: 0 nnodes: 50
The Gromacs distribution was built Mon Sep 29 13:25:26 CEST 2008 by
bjornss at nid00163 (Linux 2.6.16.54-0.2.5-ss x86_64)
:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 4.0_rc1 (-:
---------------------------------------------
tail md.log -n 300 (excerpt)
Step 518975: Run time exceeded 0.990 hours, will terminate the run
............................
,,,
Parallel run - timing based on wallclock.
NODE (s) Real (s) (%)
Time: 1426.000 1426.000 100.0
23:46
(Mnbf/s) (GFlops) (ns/day) (hour/ns)
Performance: 100.149 29.098 242.356 0.099
Finished mdrun on node 0 Mon Sep 29 20:35:28 2008
--------------------------
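The timestamps in the excerpt can be checked against the -maxh limit with a little shell arithmetic. This is only a sketch: to_seconds is a made-up helper name, the times are copied by hand from the log lines above, and it assumes both timestamps fall on the same day.

```shell
#!/bin/bash
# Convert HH:MM:SS to seconds (10# forces base 10 so "08" is not read as octal).
to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

start=$(to_seconds 20:11:42)   # "Log file opened" time from md.log
end=$(to_seconds 20:35:28)     # "Finished mdrun" time from md.log
elapsed=$(( end - start ))     # same day, so no midnight wrap to handle
limit=$(( 99 * 3600 / 100 ))   # 0.99 h (from -maxh 1) in seconds

printf 'elapsed: %d s (%02d:%02d)\n' "$elapsed" $(( elapsed / 60 )) $(( elapsed % 60 ))
printf 'limit:   %d s (%02d:%02d)\n' "$limit" $(( limit / 60 )) $(( limit % 60 ))
printf 'fraction used: %d%%\n' $(( 100 * elapsed / limit ))
```

This gives 1426 s (23:46) elapsed against a limit of 3564 s (59:24), which agrees with the "Time:" line that mdrun itself reports.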
That is, I got about 40% of the allotted walltime here as well.
Curiously, 1:35 / 4:00 (sexagesimally) ~ 40%. That is, the ratio
between the actually obtained time and the scheduled walltime is about
the same in both cases.
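To put numbers on the comparison, the two ratios can be computed with integer arithmetic in bash. Again just a sketch: percent is a hypothetical helper, and the inputs are the 1:35 of 4:00 quoted for the earlier run (in minutes) and the 1426 s of 3564 s seen in this one.

```shell
#!/bin/bash
# Print used/allotted as a percentage with one decimal, integer arithmetic only.
percent() {
    local used=$1 allotted=$2
    # Work in tenths of a percent; adding allotted/2 rounds to nearest.
    local tenths=$(( (1000 * used + allotted / 2) / allotted ))
    printf '%d.%d%%\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

percent 95 240      # earlier run: 1:35 of 4:00, in minutes
percent 1426 3564   # this run: seconds used vs. the 0.99 h limit
```

Under these inputs the two runs come out at 39.6% and 40.0%. If the earlier run really took 1:40 rather than 1:35, the first figure would be 41.7% instead.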
Regards
Bjørn
On Wed, 2008-10-01 at 13:25 +0200, Berk Hess wrote:
> Hi,
>
> The Cray XT4 has a torus network, but you don't get access to it as a
> torus.
> You will get assigned processors which can be anywhere in the machine
> and they are usually never in a nice cube, but there are always some
> missing.
> Therefore software, such as Gromacs, can not make use of proper
> Cartesian (torus) communication as one can for instance on a Blue Gene.
>
> I have no clue about the wallclock issue.
> Can you find out if the run took 1.35 or 4 hours?
> The start time is somewhere at the beginning of the log file.
>
> Berk
-------------- next part --------------
A non-text attachment was scrubbed...
Name: md.log
Type: text/x-log
Size: 23428 bytes
Desc: not available
URL: <http://maillist.sys.kth.se/pipermail/gromacs.org_gmx-users/attachments/20081001/f57b0993/attachment.bin>