[gmx-users] MPI_Recv invalid count and system explodes for large but not small parallelization on power6 but not opterons

chris.neale at utoronto.ca
Wed Mar 4 04:48:34 CET 2009


Thanks, Mark.

Your information is always useful. In this case, however, the page
that you reference appears to be empty; all I see is "There is
currently no text in this page, you can search for this page title
in other pages or edit this page."

Thanks also for your consideration of the massive scaling issue.

Chris.

chris.neale at utoronto.ca wrote:
> Hello,
>
> I am currently testing a large system on a power6 cluster. I have  
> compiled gromacs 4.0.4 successfully, and it appears to be working  
> fine for <64 "cores" (sic, see later). First, I notice that it runs  
> at approximately half the speed that it obtains on some older  
> opterons, which is unfortunate but acceptable. Second, I run into  
> some strange issues when I use a greater number of cores. Since  
> there are 32 cores per node with simultaneous multithreading, this  
> yields 64 tasks inside one box, and I realize that these problems  
> could be MPI-related.
>
> Some background:
> This test system is stable for >100 ns on an opteron, so I am quite  
> confident that I do not have a problem with my topology or starting  
> structure.
>
> Compilation with -O2 was successful only when I modified the  
> ./configure file as follows; otherwise I got a stray ')' and a  
> linking error:
> [cneale at tcs-f11n05]$ diff configure.000 configure
> 5052a5053
>> ac_cv_f77_libs="-L/scratch/cneale/exe/fftw-3.1.2_aix/exec/lib  
>> -lxlf90 -L/usr/lpp/xlf/lib -lxlopt -lxlf -lxlomp_ser -lpthreads -lm  
>> -lc"

Rather than modifying configure, I suggest you use a customized
configure command line, such as the one described at
http://wiki.gromacs.org/index.php/GROMACS_on_BlueGene. The resulting
config.log will have a record of what you did, too.
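
For example, something like the following should reproduce the effect
of your edit by pre-seeding the same autoconf cache variable on the
command line (a sketch only: the library list is copied verbatim from
your diff, and I have not tested it on your AIX machine):

./configure ac_cv_f77_libs="-L/scratch/cneale/exe/fftw-3.1.2_aix/exec/lib -lxlf90 -L/usr/lpp/xlf/lib -lxlopt -lxlf -lxlomp_ser -lpthreads -lm -lc"

followed by whatever other configure options you normally pass.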

Sorry I can't help with the massive scaling issue.

Mark



