[gmx-developers] thread problem on AMD 12 core chips?
Sander Pronk
pronk at cbr.su.se
Fri Jan 13 11:23:31 CET 2012
This error is generated when the thread library cannot create threads, for instance because the process is out of memory or has hit a ulimit. I have never seen it before; it is probably due to the OS.
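To rule out the ulimit case, something like the following could be run on the compute node. It is only a sketch: which limit (if any) is involved here is a guess on my part, and the names `LIMITS`, `fmt` and `thread_limits` are made up for this example. It uses only Python's standard `resource` module and assumes a Linux node.

```python
import resource

# Limits that commonly cap how many threads a process can create on Linux.
# Which one (if any) matters for this failure is an assumption, not a diagnosis.
LIMITS = {
    "max user processes/threads (ulimit -u)": resource.RLIMIT_NPROC,
    "stack size per thread (ulimit -s)": resource.RLIMIT_STACK,
    "virtual memory (ulimit -v)": resource.RLIMIT_AS,
}


def fmt(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def thread_limits():
    """Return {description: (soft, hard)} for the limits listed above.

    Stack and address-space values are reported in bytes.
    """
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


if __name__ == "__main__":
    for name, (soft, hard) in thread_limits().items():
        print(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")
```

A low soft limit on user processes, or a very large per-thread stack combined with a virtual memory cap, would be the first things to look at.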
It could also be a thread affinity API incompatibility: if the number of threads equals the number of hardware threads (cores, etc.), thread_mpi enforces thread affinity.
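As an illustration of that rule, here is a small sketch that compares a requested thread count with what the OS reports. It does not call thread_mpi at all; `pinning_expected` is a hypothetical helper that only mirrors the condition described above, and `os.sched_getaffinity` is Linux-specific, so the code falls back to the full CPU list elsewhere.

```python
import os


def hardware_threads():
    """Logical CPUs reported by the OS (may differ from physical cores)."""
    return os.cpu_count() or 1


def allowed_cpus():
    """CPUs this process is allowed to run on.

    Falls back to all CPUs where os.sched_getaffinity is unavailable.
    """
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return list(range(hardware_threads()))


def pinning_expected(nthreads, hw=None):
    """Hypothetical helper: True when nthreads equals the hardware thread
    count, which is the case in which affinity would be enforced."""
    hw = hardware_threads() if hw is None else hw
    return nthreads == hw


if __name__ == "__main__":
    print("hardware threads:", hardware_threads())
    print("allowed CPUs:", allowed_cpus())
    for n in (30, 32):
        print(f"-nt {n}: pinning expected = {pinning_expected(n)}")
```

If the allowed CPU list is shorter than the hardware thread count (for example under a batch system that restricts the job to a subset of cores), pinning to all cores is one plausible place for things to go wrong.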
Could you try with:
mdrun -nt 30
(or any number other than 32 that is compatible with domain decomposition)
and report whether that works?
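Independently of mdrun, it may also help to check whether the OS lets a single process hold that many threads at once. The sketch below does this with Python's standard `threading` module; it says nothing about thread_mpi itself, and `can_start_threads` is a name invented for this example.

```python
import threading


def can_start_threads(n, timeout=10.0):
    """Try to have n threads alive at the same time.

    Returns True if all n threads started and met at a barrier,
    False if thread creation failed or the barrier timed out.
    """
    barrier = threading.Barrier(n + 1)  # n workers plus the calling thread
    threads = []

    def worker():
        try:
            barrier.wait(timeout)
        except threading.BrokenBarrierError:
            pass  # another party gave up; just exit

    try:
        for _ in range(n):
            t = threading.Thread(target=worker, daemon=True)
            t.start()  # raises RuntimeError if the OS refuses a new thread
            threads.append(t)
    except RuntimeError:
        barrier.abort()  # release the workers that did start
        return False

    try:
        barrier.wait(timeout)
    except threading.BrokenBarrierError:
        return False

    for t in threads:
        t.join()
    return True


if __name__ == "__main__":
    for n in (30, 32):
        print(f"{n} threads: {'ok' if can_start_threads(n) else 'FAILED'}")
```

If this already fails at 32, the problem is on the OS side rather than in the affinity code.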
Sander
On 12 Jan 2012, at 14:39 , David van der Spoel wrote:
> On 2012-01-12 13:31, Ake Sandgren wrote:
>> On Thu, 2012-01-12 at 13:18 +0100, David van der Spoel wrote:
>>> On 2012-01-12 11:36, Berk Hess wrote:
>>>> On 01/12/2012 11:24 AM, David van der Spoel wrote:
>>>>> On 2012-01-12 11:17, Berk Hess wrote:
>>>>>> Which compiler is this?
>>>>>> We get lots of warnings with gcc4.6, but we run regularly on 32 and 64
>>>>>> core nodes.
>>>>>
>>>>> Thread model: posix
>>>>> gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
>>>>>
>>>>> Is intel more reliable?
>>>> This is not a matter of platform I would think.
>>>> We have only used AMD platform with 32 or more MPI threads.
>>>> I would guess this is a thread-mpi bug or a compiler issue.
>>> Compiling with Intel C 12.0.3.174 gives the same error, but the same
>>> pthread library is linked in.
>>>
>>> Other tips for debugging this?
>>>
>>>>
>>>> Berk
>>>>>>
>>>>>> Berk
>>>>>>
>>>>>> On 01/12/2012 11:04 AM, David van der Spoel wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm trying to compile and install gromacs release-4-5-patches on a new
>>>>>>> cluster with four 12-core AMD chips (abisko in Umea, Sweden). However
>>>>>>> the threaded code bails out with the following message:
>>>>>>>
>>>>>>> Reading file topol.tpr, VERSION 4.5.5-dev-20120111-9181e (double
>>>>>>> precision)
>>>>>>> Starting 32 threads
>>>>>>> tMPI error: tMPI Initialization error (in valid comm)
>>>>>>>
>>>>>>> First, I'm a bit confused why the code detects only 32 cores; second,
>>>>>>> it shows the above error and quits.
>>>>>>>
>>>>>>> Any clues?
>>
>>
>> Abisko's current nodes are 4-socket 8-core (the 12-cores are still under
>> test)
>>
>> If you are using openmpi, it does not have support for MPI threads
>> compiled in (the openib part of openmpi doesn't support this yet); that
>> probably explains your problem.
>>
> I see, that explains the 32. But gromacs uses its own mpi-over-threads implementation that does not use any MPI whatsoever.
>
>
> --
> David van der Spoel, Ph.D., Professor of Biology
> Dept. of Cell & Molec. Biol., Uppsala University.
> Box 596, 75124 Uppsala, Sweden. Phone: +46184714205.
> spoel at xray.bmc.uu.se http://folding.bmc.uu.se