[gmx-developers] GROMACS OpenCL on Gallium

Vedran Miletić rivanvx at gmail.com
Sat Nov 28 10:32:09 CET 2015


Same issue with radeonsi, interestingly enough. According to [1],
Bonaire is Sea Islands; according to [2], it should support global
(and local) atomics.

Running on 1 node with total 6 cores, 6 logical cores, 1 compatible GPU
Hardware detected:
  CPU info:
    Vendor: AuthenticAMD
    Brand:  AMD FX(tm)-6300 Six-Core Processor
    SIMD instructions most likely to fit this hardware: AVX_128_FMA
    SIMD instructions selected at GROMACS compile time: AVX_128_FMA
  GPU info:
    Number of GPUs detected: 1
    #0: name: AMD BONAIRE (DRM 2.43.0, LLVM 3.8.0), vendor: AMD, device version: OpenCL 1.1 MESA 11.2.0-devel, stat: compatible

Reading file em.tpr, VERSION 5.1-dev-20150219-7c30fcf-unknown (single precision)
Note: file tpx version 100, software tpx version 106
Using 1 MPI thread
Using 6 OpenMP threads

1 compatible GPU is present, with ID 0
1 GPU auto-selected for this run.
Mapping of GPU ID to the 1 PP rank in this node: 0

Selecting kernel for AMD
LLVM ERROR: Cannot select: t28: i32,ch = AtomicCmpSwap<Volatile LDST4[%522(addrspace=1)]> t0, t2, t4, t9
  t2: i64,ch = CopyFromReg t0, Register:i64 %vreg299
    t1: i64 = Register %vreg299
  t4: i32,ch = CopyFromReg t0, Register:i32 %vreg302
    t3: i32 = Register %vreg302
  t9: i32 = bitcast t8
    t8: f32 = fadd t7, t5
      t7: f32,ch = CopyFromReg t0, Register:f32 %vreg298
        t6: f32 = Register %vreg298
      t5: f32 = bitcast t4
        t4: i32,ch = CopyFromReg t0, Register:i32 %vreg302
          t3: i32 = Register %vreg302
In function: nbnxn_kernel_ElecEw_VdwLJ_F_opencl
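
The DAG above (bitcast, fadd, bitcast, AtomicCmpSwap) has the shape of the
usual compare-and-swap loop for emulating a float atomic add. A minimal
host-side sketch in C11, for illustration only: atomic_add_float is a
hypothetical name, this is not the GROMACS kernel source, and an OpenCL 1.1
kernel would use a built-in such as atomic_cmpxchg on a global pointer
instead of <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not GROMACS code): add `value` to the float whose
 * bit pattern is stored in *addr, using a compare-and-swap retry loop.
 * Returns the value that was stored before the addition. */
static float atomic_add_float(_Atomic uint32_t *addr, float value)
{
    uint32_t old_bits = atomic_load(addr);
    uint32_t new_bits;
    float    old_val;
    float    new_val;

    do {
        memcpy(&old_val, &old_bits, sizeof old_val);  /* bitcast i32 -> f32 */
        new_val = old_val + value;                    /* fadd */
        memcpy(&new_bits, &new_val, sizeof new_bits); /* bitcast f32 -> i32 */
        /* If another thread changed *addr in the meantime, the exchange
         * fails, old_bits is refreshed with the current contents, and the
         * loop retries with the new value. */
    } while (!atomic_compare_exchange_weak(addr, &old_bits, new_bits));

    return old_val;
}
```

If the backend cannot lower the compare-and-swap on the global address space,
a loop of this kind has nothing to compile to, which would be consistent with
the "Cannot select" failure above.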

Regards,
Vedran

[1] http://xorg.freedesktop.org/wiki/RadeonFeature/
[2] http://dri.freedesktop.org/wiki/GalliumCompute/

2015-11-27 21:05 GMT+01:00 Vedran Miletić <rivanvx at gmail.com>:
> 2015-11-27 20:50 GMT+01:00 Szilárd Páll <pall.szilard at gmail.com>:
>> Thanks for getting back! Without CAS (atomic compare-and-swap) we won't get
>> very far, I'm afraid. The kernels would need to be rewritten to dump forces
>> to global memory and reduce them later, which will likely completely kill
>> performance (and it's a hassle to do).
>>
>>
>>>
>>> I can see what I can do, but perhaps StreamComputing guys would be the
>>> ones to ask here because it is their code.
>>
>>
>> I know the code fairly well, but I double-checked to be sure and
>> (unfortunately) image support never got fixed, see:
>> http://redmine.gromacs.org/projects/gromacs/repository/revisions/master/entry/src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl_data_mgmt.cpp#L208
>> http://redmine.gromacs.org/projects/gromacs/repository/revisions/master/entry/src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl_data_mgmt.cpp#L423
>>
>> So image support is definitely not in the way of using radeonsi - and even
>> if we implement it, a version with simple direct gmem accesses for the
>> parameter lookup (latter) and the analytical estimate instead of the
>> tabulated Ewald correction (former) will always remain an option.
>>
>> Cheers,
>> --
>> Szilárd
>>
>
> If it works, that would be uber awesome... I have to compile the stack
> on the radeonsi machine... I will try to do it over the weekend; I don't
> really want to wait until next week to find out whether it works or not.
>
> Regards,
> Vedran
>
> --
> Vedran Miletić
> http://vedranmileti.ch/



-- 
Vedran Miletić
http://vedranmileti.ch/