[gmx-developers] block_tx.c error on 3.3_rc3

Erik Lindahl lindahl at sbc.su.se
Tue Oct 4 22:10:55 CEST 2005


Hi,

Yes - the bug was a pointer that wasn't NULL-initialized. On most
systems such a variable happens to start out as zero anyway, but on a
couple of systems this resulted in an immediate segfault in grompp.
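
For illustration only, here is a minimal sketch of this class of bug and its fix. It is not the actual grompp code: buffer_t, buffer_init, buffer_resize and buffer_free are hypothetical names made up for the example. The point is that realloc() and free() are only safe on a pointer that is either NULL or a valid allocation, so the pointer has to be set to NULL explicitly before first use instead of relying on whatever happens to be in memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a structure that owns an optional buffer. */
typedef struct {
    int    n;     /* number of entries currently allocated in data */
    float *data;  /* NULL means "nothing allocated yet" */
} buffer_t;

/* Put the structure into a known empty state. Skipping this step leaves
 * data with an indeterminate value, which is the bug being illustrated. */
void buffer_init(buffer_t *b)
{
    b->n    = 0;
    b->data = NULL;
}

/* Grow or shrink the buffer to n entries (n must be positive).
 * realloc(NULL, size) behaves like malloc(size), so the first call works
 * only because buffer_init() set data to NULL.
 * Returns 0 on success, -1 on invalid size or allocation failure. */
int buffer_resize(buffer_t *b, int n)
{
    float *tmp;

    if (n <= 0) {
        return -1;
    }
    tmp = realloc(b->data, (size_t)n * sizeof *tmp);
    if (tmp == NULL) {
        return -1;  /* the old buffer is still valid and still owned by b */
    }
    b->data = tmp;
    b->n    = n;
    return 0;
}

/* Release the buffer and return to the empty state. */
void buffer_free(buffer_t *b)
{
    free(b->data);  /* free(NULL) is a no-op */
    buffer_init(b);
}
```

If buffer_init() is left out, the first buffer_resize() call hands an indeterminate pointer to realloc(). That may appear to work where the memory happens to be zero and crash elsewhere, which matches the machine/compiler dependence described above.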


Cheers,

Erik

On Oct 4, 2005, at 9:23 PM, Yiannis wrote:

> Thanks for the reply,
> Yes, I have more than one job to run. It was a kind of "exercise" to
> test whether it is possible, mainly because we have a few "old and
> slow" machines that it would be interesting to combine on one job,
> and I wanted to test first on two fast but different machines.
> About the grompp bug in 3.3_rc3: if grompp and mdrun work as usual,
> can I presume I am not affected, or is it the kind of bug that can
> pass unnoticed?
> Ioannis
>
> On 4 Oct. 05, at 08:42, Erik Lindahl wrote:
>
>
>> Hi,
>>
>> Your two machines are running different architectures (x86_64 and
>> i686) and different operating systems.
>>
>> In theory it is still possible to use both of them, but you must
>> make sure the correct mdrun_mpi binary is compiled for each of them,
>> you must take care to load balance due to the speed difference, etc.
>>
>> In practice I'm not sure it's worth it, since you probably have  
>> more jobs to run anyway.
>>
>> There is a bug in 3.3_rc3 affecting grompp on some machine/compiler
>> combinations, which has been fixed in CVS.
>>
>> Cheers,
>>
>> Erik
>>
>>
>> On Oct 4, 2005, at 2:07 AM, Yiannis wrote:
>>
>>
>>> Hello,
>>> I am trying to run in parallel across two different machines, both
>>> dual Xeon: one is 2x2.8GHz i686 with Mandrake, the other is
>>> 2x3.2GHz x86_64 with Suse.
>>> Is it possible to run on all 4 CPUs?
>>> After reading the thread on the list I installed the 3.3_rc3
>>> tarball, but when I try:
>>>
>>> grompp -v -np 4
>>> mpirun -v -np 4 mdrun_mpi -v -np 4
>>>
>>> on the d.villin benchmark I get the block_tx.c error:
>>>
>>> Reading file topol.tpr, VERSION 3.3_rc3 (single precision)
>>> -------------------------------------------------------
>>> Program mdrun_mpi, VERSION 3.3_rc3
>>> Source code file: block_tx.c, line: 74
>>>
>>> Fatal error:
>>> 0: size=672, len=840, rx_count=0
>>>
>>> -------------------------------------------------------
>>>
>>> I tried to run on two CPUs on the same machine and it works with no
>>> problem. I also tried to run on two CPUs, one in each machine, and
>>> I then get the same block_tx error.
>>>
>>> So, is the problem coming from the heterogeneous environment, and
>>> should I stop trying, or is it supposed to work, and in that case
>>> how?
>>>
>>> Thanks for any help,
>>>
>>> Ioannis Nicolis
>>>
>>> _______________________________________________
>>> gmx-developers mailing list
>>> gmx-developers at gromacs.org
>>> http://www.gromacs.org/mailman/listinfo/gmx-developers
>>> Please don't post (un)subscribe requests to the list. Use the www  
>>> interface or send it to gmx-developers-request at gromacs.org.
>>>
>>>
>>>
>>
>



