[gmx-developers] 4.6 Binaries and Acceleration levels

Erik Lindahl erik at kth.se
Fri Jul 6 09:49:39 CEST 2012


It won't work as a simple multi-arch fix on the Gromacs side, and we actually did discuss this in a lot of detail in the team already. You might not remember, Szilard ;-)

The main problem isn't Gromacs, but that AMD and Intel architectures are starting to diverge pretty seriously. While the kernels are a special case that is quite performance-sensitive, this limitation is present throughout the code - there are lots of places where we get a significant speedup e.g. from Bulldozer-specific optimization, but the resulting code might require FMA instructions, and thus won't run on Intel CPUs.

Same thing with AVX. The Intel compiler does great with AVX, and can also create a separate code path with legacy instructions, but even then we would not get SSE4.1 or similar acceleration on that generic path. GCC doesn't even support the two alternative code paths.
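
For reference, the hand-written run-time detection alternative raised further down in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions, not Gromacs code: the feature bits, the kernel table, and the function names are hypothetical, every "kernel" is the same plain C placeholder, and detection relies on the GCC/Clang builtin __builtin_cpu_supports(), which is only available in newer compiler versions and only on x86. In a real build each variant would have to be compiled as a separate source file with its own flags, since the compiler can add e.g. AVX instructions even to plain C code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits for this sketch; not Gromacs identifiers. */
enum {
    FEAT_SSE2  = 1u << 0,
    FEAT_SSE41 = 1u << 1,
    FEAT_AVX   = 1u << 2,
    FEAT_FMA4  = 1u << 3   /* AMD four-operand FMA (Bulldozer) */
};

typedef double (*dot_fn)(const double *a, const double *b, size_t n);

typedef struct {
    const char *name;      /* label for logging */
    unsigned    required;  /* feature bits this variant needs */
    dot_fn      run;       /* entry point of the variant */
} kernel_t;

/* Placeholder shared by all variants. A real build would provide one
 * implementation per variant, each compiled with its own flags. */
static double dot_plain_c(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Ordered from most to least demanding; the last entry needs nothing. */
static const kernel_t kernels[] = {
    { "avx_fma4", FEAT_AVX | FEAT_FMA4, dot_plain_c },
    { "avx",      FEAT_AVX,             dot_plain_c },
    { "sse4.1",   FEAT_SSE41,           dot_plain_c },
    { "sse2",     FEAT_SSE2,            dot_plain_c },
    { "plain_c",  0,                    dot_plain_c }
};

/* Query the CPU once at startup. Returns 0 (plain C only) when the
 * builtins are not available for this compiler or architecture. */
unsigned detect_features(void)
{
    unsigned f = 0;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))   f |= FEAT_SSE2;
    if (__builtin_cpu_supports("sse4.1")) f |= FEAT_SSE41;
    if (__builtin_cpu_supports("avx"))    f |= FEAT_AVX;
    if (__builtin_cpu_supports("fma4"))   f |= FEAT_FMA4;
#endif
    return f;
}

/* Pick the first variant whose requirements are all satisfied. */
const kernel_t *select_kernel(unsigned features)
{
    size_t n = sizeof kernels / sizeof kernels[0];
    for (size_t i = 0; i < n; i++) {
        if ((features & kernels[i].required) == kernels[i].required) {
            return &kernels[i];
        }
    }
    assert(0 && "the plain_c entry matches any feature set");
    return &kernels[n - 1];
}
```

A caller would run select_kernel(detect_features()) once and keep the returned pointer. Note that this only covers call selection; it does nothing about the SIMD code spread through the rest of the code base, which is the harder part of the problem.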

Anyway, whatever improvements somebody comes up with, it will not be in 4.6.

Cheers,

Erik


On Jul 6, 2012, at 9:16 AM, Berk Hess wrote:

> On 07/06/2012 05:51 AM, Nicholas Breen wrote:
>> On Thu, Jul 05, 2012 at 11:05:53PM +0200, Szilárd Páll wrote:
>>> On Thu, Jul 5, 2012 at 10:59 PM, Christoph Junghans <junghans at votca.org> wrote:
>>>> Roland has a good point, but Debian and Fedora already compile Gromacs
>>>> for different MPI versions.
>>>> 
>>>> 4 acc. x 3 (serial, openmpi & mpich2) = 12 packages!
>>>> 
>>>> @ Jussi: Would that still be possible?
>>> I guess possible is one thing and probable is a totally different one...
>>> 
>>> Perhaps it would be good to ask the Ubuntu/Debian maintainer as well...
>> I would not want to create that much package complexity (the gromacs source
>> package already builds five binary packages!), especially if it would all be on
>> only one of the many architectures supported, and I cannot think of any other
>> package in the Debian archive that operates that way -- everything that I know
>> of with multiple CPU optimizations uses run-time detection, except for the
>> unavoidable cases with packaging the Linux kernel itself.  If it is
>> functionality that is contained purely within the shared libraries, then
>> glibc's hwcap support might be a workaround if the build system can permit
>> compiling one copy of the libraries for every supported variant.  Otherwise, if
>> it's all within mdrun itself, maybe just stick to run-time detection and
>> downgrade or eliminate the warnings it issues for "suboptimal" CPUs?
>> 
>> 
> Erik's non-bonded kernels are present as source in all flavors.
> My non-bonded kernel sources can easily be added to all flavors.
> Then we would still need compilation and call selection support.
> But there is also x86 SIMD code in the PME and bonded code,
> as well as in some other parts of the code. Unless we have a
> high level way of dealing with this, things will get messy.
> Also the compiler can add e.g. AVX instructions in plain C code.
> So if it can be handled by compiling different shared libraries,
> that would be far simpler. All compute intensive functionality
> is in the shared libraries, so that's no issue.
> 
> Cheers,
> 
> Berk
> 



