[gmx-users] multinode issue

Mark Abraham mark.j.abraham at gmail.com
Fri Dec 5 15:09:30 CET 2014


On Fri, Dec 5, 2014 at 1:37 PM, Éric Germaneau <germaneau at sjtu.edu.cn>
wrote:

> Thank you Mark,
>
> Yes this was the end of the log.
>

No, it's not the end of the .log file, it's the end of the stdout. The end
of the .log file will give us more clues about where mdrun couldn't cope
with life on this system. And the start of the log file will confirm that
you're not accidentally running the broken 4.6.2 ;-)
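
To make that concrete, here is a minimal sketch for printing both ends of the
log. The helper name show_log_ends is made up for illustration, and the file
name in the usage comment is an assumption: with "-deffnm $INPUT", mdrun should
name its log "$INPUT.log".

```shell
# Print the first and last N lines of a log file (N defaults to 40).
# show_log_ends is a hypothetical helper, not part of GROMACS.
show_log_ends() {
  log="$1"
  n="${2:-40}"
  if [ ! -f "$log" ]; then
    echo "no such log file: $log" >&2
    return 1
  fi
  echo "== start of $log =="
  head -n "$n" "$log"   # the header near the top names the mdrun version
  echo "== end of $log =="
  tail -n "$n" "$log"   # the last messages written before the abort
}

# Assumed usage, given "-deffnm $INPUT" on the mdrun command line:
# show_log_ends "$INPUT.log"
```

The start of the output is where the mdrun version should appear; the end is
where any last message before the abort should be.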

> I tried another input and got the same issue:
>
>    Number of CPUs detected (16) does not match the number reported by
>    OpenMP (1).
>    Consider setting the launch configuration manually!
>    Reading file yukuntest-70K.tpr, VERSION 4.6.3 (single precision)
>    [16:node328] unexpected disconnect completion event from [0:node299]
>    Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
>    internal ABORT - process 16
>
> Actually, I'm running some tests for our users. I'll talk with the admin
> about how to return information to the standard sysconf() routine in the
> usual way.
>

OK. But if this is Linux on x86 then that really should be hard to get
wrong. So, "What kind of machine is this?" Does non-Intel MPI do a better
job? (hint, this has been true...)
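
A quick way to see what a node reports is to compare the online CPU count with
the count the current process is allowed to use. This is only a sketch: it
assumes a Linux node with getconf and GNU coreutils nproc available, and it is
not a copy of what mdrun does internally.

```shell
# Online CPU count, as sysconf(_SC_NPROCESSORS_ONLN) would report it.
online=$(getconf _NPROCESSORS_ONLN)

# CPUs usable by this process; nproc takes the CPU affinity mask and
# OMP_NUM_THREADS into account, so it can differ from the online count.
usable=$(nproc)

echo "$(hostname): online=$online usable=$usable"
```

Running the same lines through the launcher used for the job, on each node in
the machinefile, would show whether the launched processes see fewer CPUs than
the node has.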

Mark


> Thank you,
>
>            Éric.
>
>
> On 12/05/2014 07:38 PM, Mark Abraham wrote:
>
>> On Fri, Dec 5, 2014 at 9:15 AM, Éric Germaneau <germaneau at sjtu.edu.cn>
>> wrote:
>>
>>> Dear all,
>>>
>>> I use impi and when I submit a job (via LSF) to more than one node I get
>>> the following message:
>>>
>>>     Number of CPUs detected (16) does not match the number reported by
>>>     OpenMP (1).
>>>
>> That suggests this machine has not been set up to return information to
>> the standard sysconf() routine in the usual way. What kind of machine is
>> this?
>>
>>>     Consider setting the launch configuration manually!
>>>     Reading file test184000atoms_verlet.tpr, VERSION 4.6.2 (single
>>>     precision)
>>>
>> I hope that's just a 4.6.2-era .tpr, but nobody should be using 4.6.2
>> mdrun because there was a bug in only that version affecting precisely
>> these kinds of issues...
>>
>>>     [16:node319] unexpected disconnect completion event from [11:node328]
>>>     Assertion failed in file ../../dapl_conn_rc.c at line 1179: 0
>>>     internal ABORT - process 16
>>>
>>> I submit doing
>>>
>>>     mpirun -np 32 -machinefile nodelist $EXE -v -deffnm $INPUT
>>>
>>> The machinefile looks like this
>>>
>>>     node328:16
>>>     node319:16
>>>
>>> I'm running the release 4.6.7.
>>> I do not set anything about OpenMP for this job; I'd like to have 32 MPI
>>> processes.
>>>
>>> Using one node it works fine.
>>> Any hints here?
>>>
>> Everything seems fine. What was the end of the .log file? Can you run
>> another MPI test program thus?
>>
>> Mark
>>
>>>
>>>                                                                Éric.
>>>
>>> --
>>> Éric Germaneau (???), Specialist
>>> Center for High Performance Computing
>>> Shanghai Jiao Tong University
>>> Room 205 Network Center, 800 Dongchuan Road, Shanghai 200240 China
>>> M:germaneau at sjtu.edu.cn P:+86-136-4161-6480 W:http://hpc.sjtu.edu.cn
>>> --
>>> Gromacs Users mailing list
>>>
>>> * Please search the archive at http://www.gromacs.org/
>>> Support/Mailing_Lists/GMX-Users_List before posting!
>>>
>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>
>>> * For (un)subscribe requests visit
>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>>> send a mail to gmx-users-request at gromacs.org.
>>>
>>>
> --
> Éric Germaneau (???), Specialist
> Center for High Performance Computing
> Shanghai Jiao Tong University
> Room 205 Network Center, 800 Dongchuan Road, Shanghai 200240 China
> Email:germaneau at sjtu.edu.cn Mobi:+86-136-4161-6480 http://hpc.sjtu.edu.cn
>

