[gmx-developers] Re: gmx_fatal deadlock bug

Berk Hess hess at cbr.su.se
Fri Jan 29 10:59:27 CET 2010


Hi,

The problem with the current code is that there are no guarantees that
it won't deadlock.
This might only appear very infrequently, but it would still be very
annoying.
So we either have to test things very thoroughly or at least simplify
the mutex locking a bit, for instance by getting rid of the global
warning variables, which are only used by two or three programs.
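
To make the failure mode concrete, here is a minimal sketch of the
pattern described further down in this thread: one mutex guarding two
resources, locked in two functions where one calls the other. It is not
the gmx_fatal.c code. All names (shared_mutex, add_warning_nested,
set_debug_level, ...) are hypothetical, and plain POSIX threads stand in
for whatever mutex API the real code uses. The mutex is created as
PTHREAD_MUTEX_ERRORCHECK so that the nested lock reports EDEADLK instead
of hanging; with a default mutex the same call path would typically
block forever.

```c
#define _XOPEN_SOURCE 700 /* for PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical: a single mutex that guards two unrelated resources. */
static pthread_mutex_t shared_mutex;
static int warning_count = 0; /* resource 1 (hypothetical) */
static int debug_level   = 0; /* resource 2 (hypothetical) */

/* Create the mutex as error-checking, so relocking by the owning thread
 * returns EDEADLK rather than blocking. */
void init_shared_mutex(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&shared_mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
}

/* Takes shared_mutex to update resource 2. Returns 0 or the lock error. */
int set_debug_level(int level)
{
    int rc = pthread_mutex_lock(&shared_mutex);
    if (rc != 0)
    {
        return rc; /* EDEADLK if this thread already holds the mutex */
    }
    debug_level = level;
    pthread_mutex_unlock(&shared_mutex);
    return 0;
}

/* Problem pattern: calls set_debug_level() while still holding the mutex. */
int add_warning_nested(void)
{
    int rc;
    pthread_mutex_lock(&shared_mutex);
    warning_count++;
    rc = set_debug_level(1); /* nested lock attempt on the same mutex */
    pthread_mutex_unlock(&shared_mutex);
    return rc;
}

/* One possible fix: release the mutex before calling the other function. */
int add_warning_fixed(void)
{
    pthread_mutex_lock(&shared_mutex);
    warning_count++;
    pthread_mutex_unlock(&shared_mutex);
    return set_debug_level(1);
}
```

After init_shared_mutex(), add_warning_nested() returns EDEADLK while
add_warning_fixed() returns 0. Giving each resource its own mutex, or
removing the shared globals altogether, would avoid the nesting in the
first place.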

Berk

David van der Spoel wrote:
> On 1/29/10 10:40 AM, Sander Pronk wrote:
>> The fix looks fine; the only weird thing I see is the 'if
>> (msg==NULL)' check in _gmx_error.
>> I haven't seen gmx_fatal deadlock yet: what triggered it?
>>
>> In general, gmx_fatal.c and futil.c contain many ugly hacks that need
>> to go away. Especially futil.c with its dependence on a global list
>> of open files/pipes, and its interlocking function calls, is a
>> constant source of deadlocks or thread safety issues whenever someone
>> wants to change something. The only real way to fix this is to change
>> the interface to the rest of the code.
>> The sheer amount of work involved in changing APIs that are called by
>> most of the code in Gromacs has kept me from doing it now, however.
>> Perhaps it's best to wait for the 5.0 branch.
>>
> I plead guilty.
>
> However, in the case of file I/O it is not so hopeless, even though
> someone will have to do emacs *.c. If the file I/O routines returned
> an abstract type rather than an integer, we could get rid of the
> global variables. The compiler will be able to help fix most
> problems. But it is definitely after 4.1.
>
>
>> Sander
>>
>>
>> On Jan 28, 2010, at 20:05 , Szilárd Páll wrote:
>>
>>> Hi,
>>>
>>> I have recently committed a bugfix for gmx_fatal.c that fixes a
>>> deadlock we traced and fixed with Berk this afternoon. Basically the
>>> debug_mutex (which, to be honest, I don't know exactly what it is) was
>>> used to lock more than one resource in different functions that
>>> happened to call each other.
>>>
>>> The reason I am writing is that there might still be some situations
>>> in which problems might occur and it seems that gmx_fatal would need a
>>> bit of checking and rewriting. I am not so familiar with the code so I
>>> thought I'd let you know about the issue; I also left a couple of
>>> comments where I was not sure what to do.
>>>
>>> Best regards,
>>> -- 
>>> Szilárd
>>
>
>



