[gmx-users] MPI oversubscription
Christian H.
hypolit at googlemail.com
Tue Feb 5 14:52:17 CET 2013
Head of .log:
Gromacs version: VERSION 5.0-dev-20121213-e1fcb0a-dirty
GIT SHA1 hash: e1fcb0a3d2768a8bb28c2e4e8012123ce773e18c (dirty)
Precision: single
MPI library: MPI
OpenMP support: disabled
GPU support: disabled
invsqrt routine: gmx_software_invsqrt(x)
CPU acceleration: AVX_256
FFT library: fftw-3.3.2-sse2
Large file support: disabled
RDTSCP usage: enabled
Built on: Tue Feb 5 10:58:32 CET 2013
Built by: christian at k [CMAKE]
Build OS/arch: Linux 3.4.11-2.16-desktop x86_64
Build CPU vendor: GenuineIntel
Build CPU brand: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
Build CPU family: 6 Model: 42 Stepping: 7
Build CPU features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr
nonstop_tsc pcid pclmuldq pdcm popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2
ssse3 tdt
C compiler: /home/christian/opt/bin/mpicc GNU gcc (GCC) 4.8.0
20120618 (experimental)
C compiler flags: -mavx -Wextra -Wno-missing-field-initializers
-Wno-sign-compare -Wall -Wno-unused -Wunused-value -Wno-unknown-pragmas
-fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -O3
-DNDEBUG
C++ compiler: /home/christian/opt/bin/mpiCC GNU g++ (GCC) 4.8.0
20120618 (experimental)
C++ compiler flags: -mavx -std=c++0x -Wextra
-Wno-missing-field-initializers -Wnon-virtual-dtor -Wall -Wno-unused
-Wunused-value -Wno-unknown-pragmas -fomit-frame-pointer
-funroll-all-loops -fexcess-precision=fast -O3 -DNDEBUG
I will try your workaround, thanks!
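For anyone reading along, the workaround quoted below amounts to a simple
fallback pattern: if the detected count is zero or negative, take the number
from a second source. A minimal sketch of that idea, not the GROMACS source;
detect_online_cpus and pick_thread_count are made-up names, and the fallback
value is passed in by the caller where the workaround calls
gmx_omp_get_num_procs():

```c
#include <unistd.h>

/* Hypothetical helper, not a GROMACS function: ask sysconf for the number
 * of online processors. Returns -1 if the macro is not visible at compile
 * time, which is the situation described in this thread. */
long detect_online_cpus(void)
{
    long ret = -1;
#if defined(_SC_NPROCESSORS_ONLN)
    ret = sysconf(_SC_NPROCESSORS_ONLN);
#endif
    return ret;
}

/* Hypothetical helper: use the detected count when it is positive,
 * otherwise the caller-supplied fallback, and never report fewer than 1. */
int pick_thread_count(long detected, long fallback)
{
    if (detected <= 0)
    {
        detected = fallback;
    }
    return (detected > 0) ? (int) detected : 1;
}
```

A caller would use it as pick_thread_count(detect_online_cpus(), n), where n
comes from whatever second source is available in the build.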
2013/2/5 Berk Hess <gmx3 at hotmail.com>
>
> OK, then this is an unhandled case.
> Strange, because I am also running OpenSUSE 12.2 with the same CPU, but
> use gcc 4.7.1.
>
> I will file a bug report on redmine.
> Could you also post the header of md.log which gives all configuration
> information?
>
> To make it work for now, you can insert immediately after #ifdef
> GMX_OPENMP:
> if (ret <= 0)
> {
>     ret = gmx_omp_get_num_procs();
> }
>
>
> Cheers,
>
> Berk
>
> ----------------------------------------
> > Date: Tue, 5 Feb 2013 14:27:44 +0100
> > Subject: Re: [gmx-users] MPI oversubscription
> > From: hypolit at googlemail.com
> > To: gmx-users at gromacs.org
> >
> > None of the variables referenced here are set on my system, the print
> > statements are never executed.
> >
> > What I did:
> >
> > printf("Checking which processor variable is set");
> > #if defined(_SC_NPROCESSORS_ONLN)
> > ret = sysconf(_SC_NPROCESSORS_ONLN);
> > printf("case 1 ret = %d\n",ret);
> > #elif defined(_SC_NPROC_ONLN)
> > ret = sysconf(_SC_NPROC_ONLN);
> > printf("case 2 ret = %d\n",ret);
> > #elif defined(_SC_NPROCESSORS_CONF)
> > ret = sysconf(_SC_NPROCESSORS_CONF);
> > printf("case 3 ret = %d\n",ret);
> > #elif defined(_SC_NPROC_CONF)
> > ret = sysconf(_SC_NPROC_CONF);
> > printf("case 4 ret = %d\n",ret);
> > #endif /* End of check for sysconf argument values */
> >
> > From /etc/issue:
> > Welcome to openSUSE 12.2 "Mantis" - Kernel \r (\l)
> > From uname -a:
> > Linux kafka 3.4.11-2.16-desktop #1 SMP PREEMPT Wed Sep 26 17:05:00 UTC 2012
> > (259fc87) x86_64 x86_64 x86_64 GNU/Linux
> >
> >
> >
> > 2013/2/5 Berk Hess <gmx3 at hotmail.com>
> >
> > >
> > > Hi,
> > >
> > > This is the same cpu I have in my workstation and this case should not
> > > cause any problems.
> > >
> > > Which operating system and version are you using?
> > >
> > > If you know a bit about programming, could you check what goes wrong in
> > > get_nthreads_hw_avail
> > > in src/gmxlib/gmx_detect_hardware.c ?
> > > Add after the four "ret =" at lines 434, 436, 438 and 440:
> > > printf("case 1 ret = %d\n",ret);
> > > and replace 1 by different numbers.
> > > Thus you can check if one of the 4 cases returns 0 or none of the cases
> > > is called.
> > >
> > > Cheers,
> > >
> > > Berk
> > >
> > >
> > > ----------------------------------------
> > > > Date: Tue, 5 Feb 2013 13:45:02 +0100
> > > > Subject: Re: [gmx-users] MPI oversubscription
> > > > From: hypolit at googlemail.com
> > > > To: gmx-users at gromacs.org
> > > >
> > > > From the .log file:
> > > >
> > > > Present hardware specification:
> > > > Vendor: GenuineIntel
> > > > Brand: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
> > > > Family: 6 Model: 42 Stepping: 7
> > > > Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc
> > > > pcid pclmuldq pdcm popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt
> > > > Acceleration most likely to fit this hardware: AVX_256
> > > > Acceleration selected at GROMACS compile time: AVX_256
> > > >
> > > > Table routines are used for coulomb: FALSE
> > > > Table routines are used for vdw: FALSE
> > > >
> > > >
> > > > From /proc/cpuinfo (8 entries like this in total):
> > > >
> > > > processor : 0
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 42
> > > > model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
> > > > stepping : 7
> > > > microcode : 0x28
> > > > cpu MHz : 1600.000
> > > > cache size : 8192 KB
> > > > physical id : 0
> > > > siblings : 8
> > > > core id : 0
> > > > cpu cores : 4
> > > > apicid : 0
> > > > initial apicid : 0
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 13
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> > > > cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
> > > > rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
> > > > nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2
> > > > ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes
> > > > xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi
> > > > flexpriority ept vpid
> > > > bogomips : 6784.04
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 36 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > >
> > > > It also does not work on the local cluster, the output in the .log
> > > > file is:
> > > >
> > > > Detecting CPU-specific acceleration.
> > > > Present hardware specification:
> > > > Vendor: AuthenticAMD
> > > > Brand: AMD Opteron(TM) Processor 6220
> > > > Family: 21 Model: 1 Stepping: 2
> > > > Features: aes apic avx clfsh cmov cx8 cx16 fma4 htt lahf_lm misalignsse mmx
> > > > msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a sse4.1
> > > > sse4.2 ssse3 xop
> > > > Acceleration most likely to fit this hardware: AVX_128_FMA
> > > > Acceleration selected at GROMACS compile time: AVX_128_FMA
> > > > Table routines are used for coulomb: FALSE
> > > > Table routines are used for vdw: FALSE
> > > >
> > > > I am not too sure about the details for that setup, but the brand looks
> > > > about right.
> > > > Do you need any other information?
> > > > Thanks for looking into it!
> > > >
> > > > 2013/2/5 Berk Hess <gmx3 at hotmail.com>
> > > >
> > > > >
> > > > > Hi,
> > > > >
> > > > > This looks like our CPU detection code failed and the result is not
> > > > > handled properly.
> > > > >
> > > > > What hardware are you running on?
> > > > > > Could you mail the 10 lines from the md.log file following:
> > > > > > "Detecting CPU-specific acceleration."?
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Berk
> > > > >
> > > > >
> > > > > ----------------------------------------
> > > > > > Date: Tue, 5 Feb 2013 11:38:53 +0100
> > > > > > From: hypolit at googlemail.com
> > > > > > To: gmx-users at gromacs.org
> > > > > > Subject: [gmx-users] MPI oversubscription
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > I am using the latest git version of gromacs, compiled with gcc 4.6.2 and
> > > > > > > openmpi 1.6.3.
> > > > > > I start the program using the usual mpirun -np 8 mdrun_mpi ...
> > > > > > This always leads to a warning:
> > > > > >
> > > > > > Using 1 MPI process
> > > > > > > WARNING: On node 0: oversubscribing the available 0 logical CPU cores per
> > > > > > > node with 1 MPI processes.
> > > > > >
> > > > > > > Checking the processes confirms that only one of the 8 available cores
> > > > > > > is used.
> > > > > > Running mdrun_mpi with an additional debug -1:
> > > > > >
> > > > > > > Detected 0 processors, will use this as the number of supported hardware
> > > > > > > threads.
> > > > > > > hw_opt: nt 0 ntmpi 0 ntomp 1 ntomp_pme 1 gpu_id ''
> > > > > > > 0 CPUs detected, but 8 was returned by CPU_COUNTIn gmx_setup_nodecomm:
> > > > > > > hostname 'myComputerName', hostnum 0
> > > > > > > ...
> > > > > > > 0 CPUs detected, but 8 was returned by CPU_COUNTOn rank 0, thread 0, core
> > > > > > > 0 the affinity setting returned 0
> > > > > >
> > > > > > > I also made another try by compiling gromacs using some experimental
> > > > > > > version of gcc 4.8, which did not help in this case.
> > > > > > > Is this a known problem? Obviously gromacs detects the right value with
> > > > > > > CPU_COUNT, why is it not just taking that value?
> > > > > >
> > > > > >
> > > > > > Best regards,
> > > > > > Christian
> > > > > > --
> > > > > > gmx-users mailing list gmx-users at gromacs.org
> > > > > > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > > > > > > * Please search the archive at
> > > > > > > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > > > > > * Please don't post (un)subscribe requests to the list. Use the
> > > > > > www interface or send it to gmx-users-request at gromacs.org.
> > > > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > > > --
> > > > > gmx-users mailing list gmx-users at gromacs.org
> > > > > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > > > > * Please search the archive at
> > > > > http://www.gromacs.org/Support/Mailing_Lists/Search before
> posting!
> > > > > * Please don't post (un)subscribe requests to the list. Use the
> > > > > www interface or send it to gmx-users-request at gromacs.org.
> > > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > > >
> > > > --
> > > > gmx-users mailing list gmx-users at gromacs.org
> > > > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > > > * Please search the archive at
> > > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > > > * Please don't post (un)subscribe requests to the list. Use the
> > > > www interface or send it to gmx-users-request at gromacs.org.
> > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > --
> > > gmx-users mailing list gmx-users at gromacs.org
> > > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > > * Please search the archive at
> > > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > > * Please don't post (un)subscribe requests to the list. Use the
> > > www interface or send it to gmx-users-request at gromacs.org.
> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >
> > --
> > gmx-users mailing list gmx-users at gromacs.org
> > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > * Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-users-request at gromacs.org.
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> --
> gmx-users mailing list gmx-users at gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-users
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> * Please don't post (un)subscribe requests to the list. Use the
> www interface or send it to gmx-users-request at gromacs.org.
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
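A variant of the sysconf check quoted above with an explicit #else branch
makes the "none of the macros is defined" case visible instead of silent.
This is a rough sketch, assuming a POSIX system that provides <unistd.h>;
probe_sysconf_cpus and report_sysconf_cpus are hypothetical helpers and not
part of GROMACS:

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns which branch was compiled in (1 to 4),
 * or 0 if none of the sysconf macros is visible at compile time.
 * *ret receives the sysconf result, or stays -1 when no branch applies. */
int probe_sysconf_cpus(long *ret)
{
    *ret = -1;
#if defined(_SC_NPROCESSORS_ONLN)
    *ret = sysconf(_SC_NPROCESSORS_ONLN);
    return 1;
#elif defined(_SC_NPROC_ONLN)
    *ret = sysconf(_SC_NPROC_ONLN);
    return 2;
#elif defined(_SC_NPROCESSORS_CONF)
    *ret = sysconf(_SC_NPROCESSORS_CONF);
    return 3;
#elif defined(_SC_NPROC_CONF)
    *ret = sysconf(_SC_NPROC_CONF);
    return 4;
#else
    return 0;
#endif
}

/* Print the outcome to stderr, which is unbuffered, so the message
 * appears even if the program stops shortly afterwards. */
void report_sysconf_cpus(void)
{
    long ret;
    int  which = probe_sysconf_cpus(&ret);

    if (which == 0)
    {
        fprintf(stderr, "no sysconf CPU-count macro visible at compile time\n");
    }
    else
    {
        fprintf(stderr, "case %d ret = %ld\n", which, ret);
    }
}
```

If this reports case 0 when built inside the application but a positive case
as a standalone file, the difference would lie in which headers and
preprocessor definitions are in effect at that point in the build.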