[gmx-developers] Re: gmx_fatal deadlock bug

Sander Pronk pronk at cbr.su.se
Fri Jan 29 11:55:16 CET 2010


I can do that when I get better next week, too. 

Sander 

On Jan 29, 2010, at 11:53 , Berk Hess wrote:

> Ah, indeed.
> But that does not use much of the functionality of the warning system.
> If I have time I can see if I can add local data structures to all
> functions using warning.
> 
> Berk
> 
> Sander Pronk wrote:
>> I may very well be wrong, but there's a warning() in strdb.c's fget_lines() that gets used in a few places, such as copyrite.c 
>> 
>> 
>> On Jan 29, 2010, at 11:35 , Berk Hess wrote:
>> 
>> 
>>> I don't think mdrun is using the warning code in gmx_fatal.
>>> Why do you think this is the case?
>>> 
>>> Berk
>>> 
>>> Sander Pronk wrote:
>>> 
>>>> Actually, it's not that hard to prove that mutexed code works - it's just that if not properly isolated, it easily becomes hard to maintain. 
>>>> 
>>>> I just checked, and it appears the warning code is only used in grompp, and a few places in gmxlib, where it's being used by mdrun. The question now is: in those instances in mdrun, do we want a global warning count, or is a local one good enough?
>>>> 
>>>> Sander
>>>> 
>>>> 
>>>> On Jan 29, 2010, at 10:59 , Berk Hess wrote:
>>>> 
>>>> 
>>>> 
>>>>> Hi,
>>>>> 
>>>>> The problem with the current code is that there are no guarantees that
>>>>> it won't deadlock.
>>>>> This might only appear very infrequently, but it would still be very
>>>>> annoying.
>>>>> So we either have to test things very thoroughly or at least simplify the
>>>>> mutex locking a bit,
>>>>> for instance by getting rid of the global warning variables, which are
>>>>> only used by
>>>>> two or three programs.
>>>>> 
>>>>> Berk
>>>>> 
>>>>> David van der Spoel wrote:
>>>>> 
>>>>> 
>>>>>> On 1/29/10 10:40 AM, Sander Pronk wrote:
>>>>>> 
>>>>>> 
>>>>>>> The fix looks fine; the only weird thing I see is the 'if
>>>>>>> (msg==NULL)' check in _gmx_error.
>>>>>>> I haven't seen gmx_fatal deadlock yet: what triggered it?
>>>>>>> 
>>>>>>> In general, gmx_fatal.c and futil.c contain many ugly hacks that need
>>>>>>> to go away. Especially futil.c with its dependence on a global list
>>>>>>> of open files/pipes, and its interlocking function calls, is a
>>>>>>> constant source of deadlocks or thread safety issues whenever someone
>>>>>>> wants to change something. The only real way to fix this is to change
>>>>>>> the interface to the rest of the code.
>>>>>>> The sheer amount of work involved in changing APIs that are called by
>>>>>>> most of the code in Gromacs has kept me from doing it now, however.
>>>>>>> Perhaps it's best to wait for the 5.0 branch.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> I plead guilty.
>>>>>> 
>>>>>> However in the case of file I/O it is not so hopeless, even though
>>>>>> someone will have to do emacs *.c. If the file I/O routines would
>>>>>> return an abstract type rather than an integer we could get rid of the
>>>>>> global variables. The compiler will be able to help to fix most
>>>>>> problems. But it is definitely after 4.1.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> Sander
>>>>>>> 
>>>>>>> 
>>>>>>> On Jan 28, 2010, at 20:05 , Szilárd Páll wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I have recently committed a bugfix for gmx_fatal.c that fixes a
>>>>>>>> deadlock we traced and fixed with Berk this afternoon. Basically the
>>>>>>>> debug_mutex (I don't know exactly what it is, to be honest) was
>>>>>>>> used in locking more than one resource in different functions that
>>>>>>>> happened to call each other.
>>>>>>>> 
>>>>>>>> The reason I am writing is that there might still be some situations
>>>>>>>> in which problems might occur and it seems that gmx_fatal would need a
>>>>>>>> bit of checking and rewriting. I am not so familiar with the code so I
>>>>>>>> thought I would let you know about the issue; I also left a couple of
>>>>>>>> comments where I was not sure what to do.
>>>>>>>> 
>>>>>>>> Best regards,
>>>>>>>> -- 
>>>>>>>> Szilárd
>>>>>>>> 
>>>>>>>> 
>>>>> -- 
>>>>> gmx-developers mailing list
>>>>> gmx-developers at gromacs.org
>>>>> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>>>>> Please don't post (un)subscribe requests to the list. Use the 
>>>>> www interface or send it to gmx-developers-request at gromacs.org.
>>>>> 
>>>>> 
>>>> 
>> 
>> 
> 

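The deadlock described at the start of the thread (one mutex guarding more than one resource, taken by functions that call each other) can be illustrated with a small sketch. The thread does not show the actual patch, so this is not the gmx_fatal.c fix; every name below (log_mutex, count_mutex, bump_count, record_message) is hypothetical. The sketch assumes POSIX threads and shows one common remedy: one mutex per resource, taken in a fixed order.

```c
#include <pthread.h>
#include <stdio.h>

/*
 * Problem pattern (not compiled here): a single non-recursive mutex M guards
 * both the message buffer and the counter. record_message() locks M and then
 * calls bump_count(), which locks M again, so the thread blocks on itself.
 *
 * Sketch of a remedy: one mutex per resource, always taken in the order
 * log_mutex -> count_mutex. All names are hypothetical.
 */
static pthread_mutex_t log_mutex   = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;

static int  n_messages = 0;          /* guarded by count_mutex */
static char last_message[256] = "";  /* guarded by log_mutex   */

/* Increment the counter; takes count_mutex only, never log_mutex. */
static int bump_count(void)
{
    int n;

    pthread_mutex_lock(&count_mutex);
    n = ++n_messages;
    pthread_mutex_unlock(&count_mutex);
    return n;
}

/* Store a message and return the running message count. */
int record_message(const char *msg)
{
    int n;

    pthread_mutex_lock(&log_mutex);
    snprintf(last_message, sizeof(last_message), "%s", msg);
    n = bump_count();  /* nested call locks a different mutex */
    pthread_mutex_unlock(&log_mutex);
    return n;
}
```

Because bump_count() never takes log_mutex, the nested call cannot block on a lock the caller already holds, and the fixed lock order rules out a cycle between two threads.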

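The suggestion to replace the global warning variables with local data structures could look roughly like the following. This is a minimal sketch under assumptions: warn_state_t, warn_init, warn_add and warn_too_many are invented names, not the GROMACS warning API. The point is only that a caller-owned struct needs no mutex, because no state is shared between callers.

```c
#include <stdio.h>

/* Hypothetical per-caller warning state; replaces file-scope globals. */
typedef struct {
    int n_warnings;    /* warnings issued so far through this state */
    int max_warnings;  /* how many the caller is willing to tolerate */
} warn_state_t;

/* Reset a state object owned by the caller. */
static void warn_init(warn_state_t *ws, int max_warnings)
{
    ws->n_warnings   = 0;
    ws->max_warnings = max_warnings;
}

/* Print a warning and count it in the caller's own state. */
static void warn_add(warn_state_t *ws, const char *msg)
{
    ws->n_warnings++;
    fprintf(stderr, "WARNING %d: %s\n", ws->n_warnings, msg);
}

/* Return nonzero once the caller's limit has been exceeded. */
static int warn_too_many(const warn_state_t *ws)
{
    return ws->n_warnings > ws->max_warnings;
}
```

A function such as fget_lines() would then take a warn_state_t pointer (or create its own on the stack), which answers the "global or local count" question per call site rather than once for the whole library.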

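David's idea of having the file I/O routines return an abstract type rather than an integer can be sketched as an opaque handle. The names here (demo_file_t, demo_open, demo_write_line, demo_close) are hypothetical and do not correspond to futil.c; the sketch assumes the per-file bookkeeping can live inside the handle instead of in a global list of open files.

```c
#include <stdio.h>
#include <stdlib.h>

/* In a header, only this forward declaration would be visible to callers. */
typedef struct demo_file demo_file_t;

/* Private definition: per-file state lives here, not in a global table. */
struct demo_file {
    FILE *fp;
    long  n_lines_written;
};

/* Open a file and return a handle, or NULL on failure. */
demo_file_t *demo_open(const char *path, const char *mode)
{
    demo_file_t *f = malloc(sizeof(*f));

    if (f == NULL) {
        return NULL;
    }
    f->fp = fopen(path, mode);
    if (f->fp == NULL) {
        free(f);
        return NULL;
    }
    f->n_lines_written = 0;
    return f;
}

/* Write one line; returns 0 on success, -1 on error. */
int demo_write_line(demo_file_t *f, const char *text)
{
    if (fprintf(f->fp, "%s\n", text) < 0) {
        return -1;
    }
    f->n_lines_written++;
    return 0;
}

/* Close the file and release the handle; returns fclose()'s result. */
int demo_close(demo_file_t *f)
{
    int rc = fclose(f->fp);

    free(f);
    return rc;
}
```

Since the handle is a pointer type rather than an int, any call site still passing an integer index fails to compile cleanly, which is the sense in which the compiler can help find the places that need changing.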
