[gmx-developers] Re: gromacs-4.0.4

Bernhard Bandow bandow at rrzn.uni-hannover.de
Tue Mar 17 18:16:03 CET 2009


Dear developers,

When I installed gromacs-4.0.4 and ran its test suite, a number of
tests failed. Changing compiler options, linking against different FFT
libraries, and changing parameters in the grompp.mdp files did not solve
the problem: some tests still fail, and I am stuck.

Since other postings on the same topic went to another mailing list, I
would like to give a summary here.

i) The tests were run on one node of a Linux cluster, using 8 cores.
  Each node is a dual-socket blade with two quad-core CPUs.
  /proc/cpuinfo says:

  processor       : 0
  vendor_id       : GenuineIntel
  cpu family      : 6
  model           : 23
  model name      : Intel(R) Xeon(R) CPU           E5472  @ 3.00GHz
  stepping        : 6
  cpu MHz         : 2992.496
  cache size      : 6144 KB
  physical id     : 0
  siblings        : 4
  core id         : 0
  cpu cores       : 4
  fpu             : yes
  fpu_exception   : yes
  cpuid level     : 10
  wp              : yes
  flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
  mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
  nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr dca
  lahf_lm
  bogomips        : 5990.08
  clflush size    : 64
  cache_alignment : 64
  address sizes   : 38 bits physical, 48 bits virtual
  power management:

  ... until processor 7

  The OS is SUSE Linux Enterprise Server 10 (x86_64), version 10,
  patchlevel 2.

ii) Gromacs was compiled with the Intel 11.0.069 compiler using mvapich2.
   Further options in configure were:
   --enable-fortran
   --enable-mpi
   --with-fft=mkl
   --enable-shared
   The system on which gromacs is intended to run uses mpiexec to
   launch MPI jobs, so gmxtest.pl was modified at line 31 to:

   $mdprefix = "mpiexec -np $parallel";

   To rule out effects of compilation and linking, the optimisation
   level was then reduced down to -O0, and the Intel MKL was replaced
   by fftw3.1.2.

iii) As an example of the failing tests in 'kernel': kernel020 fails,
  with /kernel020/checkvir.out containing:

  LJ-14            step   0:      -63.1209,  step   0:      6.50742
  Potential        step   0:      -321.601,  step   0:     -251.972
  Kinetic En.      step   0:       15.0135,  step   0:      29.0739
  Total Energy     step   0:      -306.588,  step   0:     -222.898
  Temperature      step   0:       1.93226,  step   0:      3.74185
  Pressure (bar)   step   0:      -3095.37,  step   0:     -2497.79
  Vir-XX           step   0:       1127.45,  step   0:      1075.85
  Vir-XY           step   0:        63.524,  step   0:      21.7773
  Vir-XZ           step   0:      -264.813,  step   0:     -260.354
  Vir-YX           step   0:       63.5241,  step   0:      21.7775
  Vir-YY           step   0:       803.252,  step   0:      581.702
  Vir-ZX           step   0:      -264.813,  step   0:     -260.355
  Vir-ZZ           step   0:       321.207,  step   0:      176.574

The deviations are obvious: the two values differ in every listed term,
and LJ-14 even differs in sign.
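
To put numbers on the deviations, the lines above can be parsed and
compared. The following is only a minimal sketch in Python: the helper
names (parse_checkvir, relative_deviation) are made up for this
illustration and are not part of gmxtest.pl or gmxcheck, and the sketch
makes no assumption about which of the two columns is the reference.

```python
import re

# Lines copied verbatim from the kernel020 checkvir.out excerpt above.
CHECKVIR = """\
LJ-14            step   0:      -63.1209,  step   0:      6.50742
Potential        step   0:      -321.601,  step   0:     -251.972
Kinetic En.      step   0:       15.0135,  step   0:      29.0739
Total Energy     step   0:      -306.588,  step   0:     -222.898
Temperature      step   0:       1.93226,  step   0:      3.74185
Pressure (bar)   step   0:      -3095.37,  step   0:     -2497.79
Vir-XX           step   0:       1127.45,  step   0:      1075.85
Vir-XY           step   0:        63.524,  step   0:      21.7773
Vir-XZ           step   0:      -264.813,  step   0:     -260.354
Vir-YX           step   0:       63.5241,  step   0:      21.7775
Vir-YY           step   0:       803.252,  step   0:      581.702
Vir-ZX           step   0:      -264.813,  step   0:     -260.355
Vir-ZZ           step   0:       321.207,  step   0:      176.574
"""

# One line looks like: "<term>  step N: <value>,  step N: <value>"
LINE = re.compile(
    r"^\s*(.+?)\s+step\s+\d+:\s*(\S+),\s+step\s+\d+:\s*(\S+)\s*$"
)


def parse_checkvir(text):
    """Return a list of (term, first_value, second_value) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            rows.append(
                (match.group(1), float(match.group(2)), float(match.group(3)))
            )
    return rows


def relative_deviation(a, b):
    """Absolute difference scaled by the larger magnitude of the two."""
    scale = max(abs(a), abs(b))
    return abs(a - b) / scale if scale else 0.0


rows = parse_checkvir(CHECKVIR)
for term, first, second in rows:
    dev = relative_deviation(first, second)
    print(f"{term:<16}{first:>12.4f}{second:>12.4f}{dev:>10.1%}")
```

Every term in this excerpt deviates by more than one percent on that
scale, which is far more than one would expect from rounding alone.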

I hope this helps to locate the reason for the failures, so that we
arrive at a set of tests that can be used to validate gromacs
installations.

Best regards

Bernhard Bandow

David van der Spoel schrieb:
> Bernhard Bandow wrote:
>> Dear Prof. van der Spoel,
>>
>> as I reported some days ago, we observe a number of tests in
>> gromacs-4.0.4 failing. Those in the complex category can all be fixed by
>> following the hints to change parameters for electrostatics or
>> thermostats. Reducing the compiler optimisation to -O0 additionally
>> fixes problems with two of the pdb2gmx tests. For the kernel tests we
>> observe the same pattern of failing tests that was also reported
>> elsewhere:
>>
>> Testing kernel020 . . . FAILED. Check files in kernel020
>> Testing kernel120 . . . FAILED. Check files in kernel120
>> Testing kernel121 . . . FAILED. Check files in kernel121
>> Testing kernel122 . . . FAILED. Check files in kernel122
>> Testing kernel123 . . . FAILED. Check files in kernel123
>> Testing kernel124 . . . FAILED. Check files in kernel124
>> Testing kernel220 . . . FAILED. Check files in kernel220
>> Testing kernel221 . . . FAILED. Check files in kernel221
>> Testing kernel222 . . . FAILED. Check files in kernel222
>> Testing kernel223 . . . FAILED. Check files in kernel223
>> Testing kernel224 . . . FAILED. Check files in kernel224
>> Testing kernel320 . . . FAILED. Check files in kernel320
>> Testing kernel321 . . . FAILED. Check files in kernel321
>> Testing kernel322 . . . FAILED. Check files in kernel322
>> Testing kernel323 . . . FAILED. Check files in kernel323
>> Testing kernel324 . . . FAILED. Check files in kernel324
>>
>> Changing parameters does not help in these cases. The situation
>> remains the same even if the program is linked against the fftw3.1.2
>> libraries instead of the Intel MKL.
>> As a scientist I think it is very important to have a tool whose
>> functionality can be tested well, in order to obtain results that can
>> be trusted. For my own problems this of course also depends on my
>> ability to write proper input.
>> My task at the HLRN is to advise the users of our computing center on
>> how to install and run the installed software.
>> Please tell me whether and how I can contribute to a solution, e.g. by
>> testing, or by writing to the gromacs team as a whole rather than to
>> one person only.
>>
> 
> The best place is the gmx-developers list. You are also welcome to
> submit a bugzilla.
> 
> There were some rumors that the kernels fail in parallel but not
> sequentially, but I haven't tested this.
> 
> 
>> Best regards
>>
>> Bernhard Bandow
> 
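
The FAILED lines quoted above can also be tallied by script to make the
pattern visible. Again this is only a sketch in Python with a made-up
helper name (failed_tests); it reads the lines exactly as gmxtest.pl
printed them in the quoted message and does not interpret what the
kernel numbers mean.

```python
import re

# Test-suite output copied from the quoted message (quote marks removed).
LOG = """\
Testing kernel020 . . . FAILED. Check files in kernel020
Testing kernel120 . . . FAILED. Check files in kernel120
Testing kernel121 . . . FAILED. Check files in kernel121
Testing kernel122 . . . FAILED. Check files in kernel122
Testing kernel123 . . . FAILED. Check files in kernel123
Testing kernel124 . . . FAILED. Check files in kernel124
Testing kernel220 . . . FAILED. Check files in kernel220
Testing kernel221 . . . FAILED. Check files in kernel221
Testing kernel222 . . . FAILED. Check files in kernel222
Testing kernel223 . . . FAILED. Check files in kernel223
Testing kernel224 . . . FAILED. Check files in kernel224
Testing kernel320 . . . FAILED. Check files in kernel320
Testing kernel321 . . . FAILED. Check files in kernel321
Testing kernel322 . . . FAILED. Check files in kernel322
Testing kernel323 . . . FAILED. Check files in kernel323
Testing kernel324 . . . FAILED. Check files in kernel324
"""


def failed_tests(text):
    """Return the names of all tests reported as FAILED, in order."""
    return re.findall(r"Testing (kernel\d{3})\b.*FAILED", text)


failed = failed_tests(LOG)

# Second of the three digits in each failing kernel number.
middle_digits = {name[-2] for name in failed}

print(len(failed), "failed kernel tests")
print("middle digits of failing kernels:", sorted(middle_digits))
```

All failing kernel numbers share the same middle digit, 2, which may
help narrow down where to look.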



