[gmx-developers] thread problem on AMD 12 core chips?

David van der Spoel spoel at xray.bmc.uu.se
Fri Jan 13 14:20:24 CET 2012


On 2012-01-13 11:23, Sander Pronk wrote:
> This error is generated when the thread library can't create threads for some reason (out of memory, or some ulimit; I've never seen it before). It is probably due to the OS.
>
> There might be a chance that this is due to thread affinity API incompatibility: if the number of threads is equal to the number of hardware threads (cores, etc), thread_mpi will enforce thread affinity.
>
> Could you try with:
>
> mdrun -nt 30
>
> (or whichever number other than 32 that is compatible with domain decomposition)
>
> and report whether that works?

It does indeed work with 30 threads, but not with 32.
Strange, because I thought I had tested it with 16, but apparently not,
since that works as well.

Note that the machine has 4 CPUs with 8 cores each (not 12 as I stated
previously).
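
For anyone who wants to test the affinity hypothesis outside GROMACS, here is a minimal sketch. It is not what thread_mpi does internally. It is a standalone check, assuming Linux and Python 3, that starts one thread per requested slot and pins each to one of the CPUs the process is allowed to use. The function name try_pinned_threads is made up for this example.

```python
import os
import threading


def try_pinned_threads(n_threads):
    """Start n_threads threads, pin each to one allowed CPU, and
    return a list of (cpu, error) tuples; error is None on success."""
    # CPUs this process may run on (Linux-only call; an assumption here).
    allowed = sorted(os.sched_getaffinity(0))
    results = [None] * n_threads

    def worker(idx):
        cpu = allowed[idx % len(allowed)]
        try:
            # pid 0 refers to the calling thread in the underlying syscall.
            os.sched_setaffinity(0, {cpu})
            results[idx] = (cpu, None)
        except OSError as exc:
            results[idx] = (cpu, str(exc))

    threads = []
    for i in range(n_threads):
        t = threading.Thread(target=worker, args=(i,))
        try:
            t.start()
        except RuntimeError as exc:
            # Raised when the interpreter cannot start a new thread.
            results[i] = (None, "thread creation failed: %s" % exc)
            continue
        threads.append(t)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    n = os.cpu_count() or 1
    for idx, (cpu, err) in enumerate(try_pinned_threads(n)):
        print("thread %d cpu %s: %s" % (idx, cpu, err or "ok"))
```

If this fails only when the thread count equals the number of hardware threads, that would point towards the OS or the affinity call rather than GROMACS itself. A clean run here does not rule out a problem in thread_mpi.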

>
> Sander
>
>
> On 12 Jan 2012, at 14:39 , David van der Spoel wrote:
>
>> On 2012-01-12 13:31, Ake Sandgren wrote:
>>> On Thu, 2012-01-12 at 13:18 +0100, David van der Spoel wrote:
>>>> On 2012-01-12 11:36, Berk Hess wrote:
>>>>> On 01/12/2012 11:24 AM, David van der Spoel wrote:
>>>>>> On 2012-01-12 11:17, Berk Hess wrote:
>>>>>>> Which compiler is this?
>>>>>>> We get lots of warnings with gcc4.6, but we run regularly on 32 and 64
>>>>>>> core nodes.
>>>>>>
>>>>>> Thread model: posix
>>>>>> gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
>>>>>>
>>>>>> Is intel more reliable?
>>>>> This is not a matter of platform I would think.
>>>>> We have only used AMD platforms with 32 or more MPI threads.
>>>>> I would guess this is a thread-mpi bug or a compiler issue.
>>>> Compiling with Intel C 12.0.3.174 gives the same error, but the same
>>>> pthread library is linked in.
>>>>
>>>> Other tips for debugging this?
>>>>
>>>>>
>>>>> Berk
>>>>>>>
>>>>>>> Berk
>>>>>>>
>>>>>>> On 01/12/2012 11:04 AM, David van der Spoel wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm trying to compile and install gromacs release-4-5-patches on a new
>>>>>>>> cluster with four 12-core AMD chips (abisko in Umea, Sweden). However
>>>>>>>> the threaded code bails out with the following message:
>>>>>>>>
>>>>>>>> Reading file topol.tpr, VERSION 4.5.5-dev-20120111-9181e (double
>>>>>>>> precision)
>>>>>>>> Starting 32 threads
>>>>>>>> tMPI error: tMPI Initialization error (in valid comm)
>>>>>>>>
>>>>>>>> First, I'm a bit confused why the code detects only 32 cores; second,
>>>>>>>> it shows the above error and quits.
>>>>>>>>
>>>>>>>> Any clues?
>>>
>>>
>>> Abisko's current nodes are 4-socket 8-core (the 12-cores are still under
>>> test)
>>>
>>> If you are using openmpi, note that it does not have support for MPI
>>> threads compiled in (the openib part of openmpi doesn't support this
>>> yet); that probably explains your problem.
>>>
>> I see, that explains the 32. But gromacs uses its own mpi-over-threads implementation that does not use any MPI whatsoever.
>>
>>
>> --
>> David van der Spoel, Ph.D., Professor of Biology
>> Dept. of Cell & Molec. Biol., Uppsala University.
>> Box 596, 75124 Uppsala, Sweden. Phone:	+46184714205.
>> spoel at xray.bmc.uu.se    http://folding.bmc.uu.se
>


-- 
David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone:	+46184714205.
spoel at xray.bmc.uu.se    http://folding.bmc.uu.se


