[gmx-users] Gromacs and Radeon Nano

melichercik at nh.cas.cz melichercik at nh.cas.cz
Mon Mar 27 16:44:55 CEST 2017


Hi,
after some suggestions and some dealings with AMD, I still don't have a
working solution. I don't even have a clue what causes those freezes/crashes.
The GPU listed twice really was the result of both the mesa and the AMD
driver OpenCL stacks being detected. I have tried
Ubuntu in 14.4.2, 16.4.1 and 16.4.4 versions (the first two should be
supported by AMD), but the problem remains unchanged.
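
For anyone hitting the same duplicate listing, here is a minimal Python sketch of telling the two entries apart from the mdrun hardware report. The device lines are copied from the report quoted further down in this thread. Classifying by the substrings "AMD-APP" and "Mesa" in the device version is an assumption based only on those two lines, and the function names are hypothetical.

```python
import re

# Device lines copied from the hardware report quoted further down in this
# thread, with the e-mail line wrapping undone.
LOG = (
    "    #0: name: Fiji, vendor: Advanced Micro Devices, Inc., device "
    "version: OpenCL 1.2 AMD-APP (2236.5), stat: compatible\n"
    "    #1: name: AMD FIJI (DRM 3.8.0 / 4.9.0-1-amd64, LLVM 3.9.1), vendor: "
    "AMD, device version: OpenCL 1.1 Mesa 13.0.3, stat: compatible\n"
)

# Captures the device index, the device version string and the status.
DEVICE_RE = re.compile(r"#(\d+): name: .*device version: (.*), stat: (\w+)")


def classify_devices(log_text):
    """Map each device index to the OpenCL stack that seems to provide it."""
    devices = {}
    for match in DEVICE_RE.finditer(log_text):
        index = int(match.group(1))
        version = match.group(2)
        # Assumption: these substrings identify the stack, as in the report.
        if "Mesa" in version:
            devices[index] = "mesa"
        elif "AMD-APP" in version:
            devices[index] = "amd-app"
        else:
            devices[index] = "unknown"
    return devices


def gpu_id_argument(log_text):
    """Build the value for mdrun's -gpu_id from the non-mesa AMD devices."""
    devices = classify_devices(log_text)
    return "".join(str(i) for i in sorted(devices) if devices[i] == "amd-app")


if __name__ == "__main__":
    print(classify_devices(LOG))
    print("gmx mdrun -gpu_id", gpu_id_argument(LOG))
```

With the report from this thread, only device 0 comes from the amdgpu-pro stack, so the run would be started with -gpu_id 0.
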
So, please, could you (Szilárd, or whoever has a working Radeon Fury
card) send me the configuration of your computer (just to compare it)? Or
even one of your .tpr files that works; or maybe I could send you mine to
test (up to 10 minutes, which is enough for my case to crash in 90% of runs).
I don't know whether the dual-socket configuration could have an influence
(but I don't have another CPU that can (nearly) fully load such a powerful
card in Gromacs), at least not with my calculation tasks.
And BTW, which distribution are you using (with the working Fury card)? I have
tried only Debian/Ubuntu, but I don't think using Redhat/CentOS/Suse
would change anything.

And the null pointer dereference with the newest AMDGPU-Pro 16.60 when calling
anything that uses OpenCL (even their clinfo) was caused by the "ast" module
for the onboard graphics. They probably changed something in the OpenCL
detection that triggers this bug (the older versions work with this kernel
module). Maybe someone will find this information useful.
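
To check whether a machine is in the same situation, one can look for the "ast" module among the loaded kernel modules. Below is a small Python sketch that parses text in the /proc/modules format, where the first field of each line is the module name. The helper names and the sample text are made up for illustration.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_loaded(name, proc_modules_text):
    """True if the named kernel module appears in the given text."""
    return name in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        # Not on Linux, or /proc is not readable: treat as nothing loaded.
        text = ""
    print("ast loaded:", is_loaded("ast", text))
```

If the module is loaded and the onboard graphics is not needed, keeping it from loading is one thing to try. Whether that is acceptable depends on the machine.
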

About the freezing: with AMDGPU-PRO I was able to measure the temperature, and
while mdrun was frozen the sensors reported the card's temperature as 511
deg. C. It is strange, isn't it?
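
A note on that number: 511 equals 2^9 - 1 (0x1FF), i.e. all ones in a 9-bit field. One possible reading, which is only an assumption and not confirmed anywhere in this thread, is that the sensor readout was saturated or invalid because the card had stopped responding, rather than the card really being that hot. A short Python sketch of the check, with a hypothetical helper name:

```python
def saturated_width(value, max_bits=16):
    """Return n if value == 2**n - 1 for some 1 <= n <= max_bits, else None."""
    for bits in range(1, max_bits + 1):
        if value == (1 << bits) - 1:
            return bits
    return None


if __name__ == "__main__":
    reading = 511  # temperature reported while mdrun was frozen
    print(hex(reading), "all-ones width:", saturated_width(reading))
```

An ordinary reading such as 65 gives None here, so a value like 511 stands out as a likely sentinel or error value.
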

Thanks.

Milan

> Hi,
>
> I have no knowledge of the instability/crash with fglrx; with
> AMDGPU-PRO I have seen strange hangs which *seem* to be kernel-space
> issues because the machine becomes unresponsive for seconds to minutes
> (but it typically recovers). However, I have had no time to investigate.
>
> Given that the extensive testing I've done was on fglrx, I'd think
> that's still the most robust choice -- though sadly unsupported and
> outdated (not even sure what's the last kernel it works with?).
>
>
> The fact that your GPU is listed twice is not something I've seen
> myself before, but it's not unreasonable if you have both the mesa and
> the amdgpu-pro OpenCL stacks installed. The former is the open source
> graphics stack + OpenCL compiler which is not fully stable for GROMACS
> use yet (mesa 13.1.x works better, but is still not production ready).
>
> When it comes to AMDGPU-PRO issues, I'd strongly recommend trying to
> reach out to AMD support and voice your feedback. Do let us know if you
> find a solution!
>
> Cheers,
> --
> Szilárd
>
>
> On Wed, Feb 8, 2017 at 2:23 AM,  <melichercik at leaf.nh.cas.cz> wrote:
>> Hi people,
>> I have a computer with 2x Xeon E5-2660 and a Radeon Fury (someone here
>> recommended it as quite a decent card; not the best one ;-) of course).
>> The system is Debian (testing). I previously had an R9 280X instead, and
>> it worked without problems. After I replaced it with the less power-hungry
>> Radeon Fury, the machine freezes from time to time (at least once a day)
>> or, in the better case, gromacs segfaults. I tried the latest fglrx (15.12)
>> and all amdgpu-pro releases (16.40, 16.50 and 16.60) and (I think) all
>> gromacs versions from 5.1 to 2016.2. With the exception of the latest
>> Amdgpu-pro 16.60, which totally failed to run OpenCL-enabled GMX, it gets
>> better over time, but it still crashes. So does anyone have any idea
>> which could help?
>> Thanks in advance.
>>
>> Best,
>>
>> Milan
>>
>> PS: The PSU is an EVGA Supernova B2 750 W, which should be quite enough
>> (the more power-hungry R9 280X worked), and it should be quite a good PSU
>> (Tier 2 on the Tom's Hardware list).
>> PS2: maybe it is not important, but I think I have tested/simulated only
>> .tpr files created in versions of GMX older than 2016.x (which I used
>> mostly for simulating).
>> PS3: why is every Radeon GPU detected twice when using amdgpu-pro (see
>> below)? (At least with the Radeon Nano and RX460 cards I have tested.)
>> And the 2nd one doesn't work; I have to use the -gpu_id parameter.
>>
>> Hardware detected:
>>   CPU info:
>>     Vendor: Intel
>>     Brand:  Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
>>     SIMD instructions most likely to fit this hardware: AVX_256
>>     SIMD instructions selected at GROMACS compile time: AVX_256
>>
>>   Hardware topology: Full, with devices
>>   GPU info:
>>     Number of GPUs detected: 2
>>     #0: name: Fiji, vendor: Advanced Micro Devices, Inc., device
>> version: OpenCL 1.2 AMD-APP (2236.5), stat: compatible
>>     #1: name: AMD FIJI (DRM 3.8.0 / 4.9.0-1-amd64, LLVM 3.9.1), vendor:
>> AMD, device version: OpenCL 1.1 Mesa 13.0.3, stat: compatible
>>
>> --
>> Gromacs Users mailing list
>>
>> * Please search the archive at
>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>> posting!
>>
>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>
>> * For (un)subscribe requests visit
>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>> send a mail to gmx-users-request at gromacs.org.



---
Disclaimer:
If not expressly stated otherwise, this e-mail message (including any attached
files) is intended purely for informational purposes and does not represent a
binding agreement on the part of Institutes of CAS. The text of this message and
its attachments cannot be considered as a proposal to conclude a contract,
neither the acceptance of a proposal to conclude a contract, nor any other legal
act leading to concluding any contract; nor does it create any
pre-contractual liability on the part of Institutes of CAS.

