[gmx-developers] gromacs 5.1rc1 OpenCL problem with Parrinello-Rahman

Erik Lindahl erik.lindahl at gmail.com
Thu Jul 16 18:59:09 CEST 2015


Hi Carlo,

IIRC, I managed to get it working on a Mac Pro with AMD Firepro D300 GPUs. Please post the contents of the top of your logfile (where it says everything about the compilers & config), and mention what hardware you tried it on - then we’ll see if we can reproduce it.

Cheers,

Erik

From: Carlo Camilloni <carlo.camilloni at gmail.com>
Reply: gmx-developers at gromacs.org <gmx-developers at gromacs.org>
Date: 16 Jul 2015 at 18:52:21
To: gromacs.org_gmx-developers at maillist.sys.kth.se <gromacs.org_gmx-developers at maillist.sys.kth.se>
Subject:  Re: [gmx-developers] gromacs 5.1rc1 OpenCL problem with Parrinello-Rahman  

Hi,  

I tested the OpenCL kernel on my MacBook (NVIDIA GPU), and here it produces the correct forces,  
so it could be a problem related to AMD + OS X, or maybe to some specific compiler/OS version.  

Carlo  


> On 15 Jul 2015, at 17:42, Carlo Camilloni <carlo.camilloni at gmail.com> wrote:  
>  
> Hi,  
>  
> these are the tests that fail:  
>  
> FAILED. Check checkpot.out (12 errors), checkforce.out (3516 errors) file(s) in dd121 for dd121  
> FAILED. Check checkpot.out (10 errors), checkforce.out (4027 errors) file(s) in nbnxn-energy-groups for nbnxn-energy-groups  
> FAILED. Check checkpot.out (26 errors), checkforce.out (2998 errors) file(s) in nbnxn-free-energy for nbnxn-free-energy  
> FAILED. Check checkpot.out (26 errors), checkforce.out (2998 errors) file(s) in nbnxn-free-energy-vv for nbnxn-free-energy-vv  
> FAILED. Check checkpot.out (11 errors), checkforce.out (4039 errors) file(s) in nbnxn-ljpme-geometric for nbnxn-ljpme-geometric  
> FAILED. Check checkpot.out (14 errors), checkforce.out (52 errors) file(s) in nbnxn-ljpme-LB for nbnxn-ljpme-LB  
> FAILED. Check checkpot.out (14 errors), checkforce.out (52 errors) file(s) in nbnxn-ljpme-LB-geometric for nbnxn-ljpme-LB-geometric  
> FAILED. Check checkpot.out (10 errors), checkforce.out (4029 errors) file(s) in nbnxn-vdw-force-switch for nbnxn-vdw-force-switch  
> FAILED. Check checkpot.out (10 errors), checkforce.out (4032 errors) file(s) in nbnxn-vdw-potential-switch for nbnxn-vdw-potential-switch  
> FAILED. Check checkpot.out (4 errors), checkforce.out (250 errors) file(s) in nbnxn-vdw-potential-switch-argon for nbnxn-vdw-potential-switch-argon  
> FAILED. Check checkpot.out (10 errors), checkforce.out (4027 errors) file(s) in nbnxn_pme for nbnxn_pme  
> FAILED. Check checkpot.out (10 errors), checkforce.out (4027 errors) file(s) in nbnxn_pme_order5 for nbnxn_pme_order5  
> FAILED. Check checkpot.out (10 errors), checkforce.out (4027 errors) file(s) in nbnxn_pme_order6 for nbnxn_pme_order6  
> FAILED. Check checkpot.out (9 errors), checkforce.out (4028 errors) file(s) in nbnxn_rf for nbnxn_rf  
> FAILED. Check checkpot.out (2 errors), checkforce.out (4 errors) file(s) in nbnxn_rzero for nbnxn_rzero  
> FAILED. Check mdrun.out, md.log file(s) in nbnxn_vsite for nbnxn_vsite  
> FAILED. Check checkpot.out (13 errors), checkforce.out (15512 errors) file(s) in octahedron for octahedron  
> FAILED. Check mdrun.out, md.log file(s) in position-restraints for position-restraints  
> FAILED. Check mdrun.out, md.log file(s) in pull_constraint for pull_constraint  
> FAILED. Check checkpot.out (10 errors), checkforce.out (4021 errors) file(s) in pull_cylinder for pull_cylinder  
> FAILED. Check checkpot.out (11 errors), checkforce.out (39054 errors) file(s) in swap_x for swap_x  
> FAILED. Check checkpot.out (11 errors), checkforce.out (39053 errors) file(s) in swap_y for swap_y  
> FAILED. Check checkpot.out (12 errors), checkforce.out (39054 errors) file(s) in swap_z for swap_z  
> 23 out of 60 complex tests FAILED  
> FAILED. Check mdrun.out, md.log file(s) in expanded for expanded  
> FAILED. Check mdrun.out, md.log file(s) in transformAtoB for transformAtoB  
> 2 out of 10 freeenergy tests FAILED  
>  
>  
> Carlo  
>  
>  
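
The failure list quoted above can be tallied mechanically. What follows is only a minimal Python sketch, assuming the line format shown in the quoted output; the function name parse_failures and the variable names are made up for illustration and are not part of GROMACS or its regression-test scripts. It separates tests that report checkpot/checkforce error counts from those that only point at mdrun.out and md.log.

```python
import re

# Line format assumed from the quoted test output, e.g.
#   FAILED. Check checkpot.out (12 errors), checkforce.out (3516 errors) file(s) in dd121 for dd121
FAILED_RE = re.compile(r"FAILED\. Check (?P<files>.+?) file\(s\) in (?P<test>\S+) for \S+")
COUNT_RE = re.compile(r"(?P<name>check\w+)\.out \((?P<n>\d+) errors\)")

def parse_failures(text):
    """Return {test_name: {"checkpot": n, "checkforce": n}} for each FAILED line.

    Tests whose line lists no error counts (only mdrun.out, md.log) map to {}.
    """
    failures = {}
    for line in text.splitlines():
        m = FAILED_RE.search(line)
        if not m:
            continue  # summary lines such as "23 out of 60 ..." are skipped
        counts = {c.group("name"): int(c.group("n"))
                  for c in COUNT_RE.finditer(m.group("files"))}
        failures[m.group("test")] = counts
    return failures

# Three lines copied from the quoted output, plus one summary line.
sample = """\
> FAILED. Check checkpot.out (12 errors), checkforce.out (3516 errors) file(s) in dd121 for dd121
> FAILED. Check checkpot.out (14 errors), checkforce.out (52 errors) file(s) in nbnxn-ljpme-LB for nbnxn-ljpme-LB
> FAILED. Check mdrun.out, md.log file(s) in nbnxn_vsite for nbnxn_vsite
> 23 out of 60 complex tests FAILED
"""

failures = parse_failures(sample)
with_counts = sorted(t for t, c in failures.items() if c)
without_counts = sorted(t for t, c in failures.items() if not c)
print("tests with energy/force mismatches:", with_counts)
print("tests without error counts:", without_counts)
```

Feeding it the full quoted list instead of the three sample lines gives the same two groups for all 25 failing tests.
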
>>  
>>  
>> Message: 4  
>> Date: Wed, 15 Jul 2015 15:35:13 +0000  
>> From: Mark Abraham <mark.j.abraham at gmail.com>  
>> To: gmx-developers at gromacs.org,  
>> gromacs.org_gmx-developers at maillist.sys.kth.se  
>> Subject: Re: [gmx-developers] gromacs 5.1rc1 OpenCL problem with  
>> Parrinello-Rahman  
>> Message-ID:  
>> <CAMNuMATveVRRyBBwn312xrY+w3M7deC2Hs3A7PZnaeugkw+VVA at mail.gmail.com>  
>> Content-Type: text/plain; charset="utf-8"  
>>  
>> Hi,  
>>  
>> Thanks. If a difference of that magnitude can be seen, then it should also  
>> show up when running the regressiontests (e.g. cmake  
>> -DREGRESSIONTEST_DOWNLOAD=on and then make check) as a failure  
>> of complex/nbnxn-ljpme-LB (which is the only P-R test that can run on the  
>> GPU). If other tests fail, then the problem is actually more widespread.  
>>  
>> It may be that there is some issue with some part of the Mac+clang+OpenCL  
>> stack - we didn't target it during development, and it was only at the last  
>> minute that Erik was unexpectedly able to get it to compile. I don't know if he got  
>> tests to pass. Erik?  
>>  
>> Mark  
>>  
>> On Wed, Jul 15, 2015 at 5:22 PM Carlo Camilloni <carlo.camilloni at gmail.com>  
>> wrote:  
>>  
>>>  
>>> Dear Mark and Szilard,  
>>>  
>>> thanks for your answer. I filed a bug in redmine but in the meantime I was  
>>> running more tests and I am a bit scared by what I found:  
>>>  
>>> What I did is the following: I performed a single-step run with  
>>> gmx51-rc1 compiled with CUDA (again with clang and so on)  
>>> and compared the forces on the first step with and without -nb cpu (I am using  
>>> -pforce 1). The forces are identical:  
>>>  
>>> i.e.:  
>>>  
>>> cuda-gpu  
>>>  
>>> step 0 atom 1 x 3.940 5.612 2.226 force 1.90839e+03  
>>> step 0 atom 2 x 3.852 5.659 2.211 force 4.24845e+02  
>>> step 0 atom 3 x 3.979 5.665 2.303 force 6.89472e+02  
>>> step 0 atom 4 x 3.992 5.610 2.139 force 7.42053e+02  
>>>  
>>>  
>>> cpu:  
>>>  
>>> step 0 atom 1 x 3.940 5.612 2.226 force 1.90839e+03  
>>> step 0 atom 2 x 3.852 5.659 2.211 force 4.24845e+02  
>>> step 0 atom 3 x 3.979 5.665 2.303 force 6.89472e+02  
>>> step 0 atom 4 x 3.992 5.610 2.139 force 7.42053e+02  
>>>  
>>> If I do the same test on the version compiled with OpenCL:  
>>>  
>>> cpu:  
>>>  
>>> (the former runs were done on my MacBook Pro with avx2_256, the latter on a  
>>> Mac Pro with avx_256; this should  
>>> explain the small differences in the forces)  
>>>  
>>> step 0 atom 1 x 3.940 5.612 2.226 force 1.90838e+03  
>>> step 0 atom 2 x 3.852 5.659 2.211 force 4.24848e+02  
>>> step 0 atom 3 x 3.979 5.665 2.303 force 6.89470e+02  
>>> step 0 atom 4 x 3.992 5.610 2.139 force 7.42043e+02  
>>>  
>>> opencl-gpu:  
>>> step 0 atom 1 x 3.940 5.612 2.226 force 1.48597e+03  
>>> step 0 atom 2 x 3.852 5.659 2.211 force 6.26942e+02  
>>> step 0 atom 3 x 3.979 5.665 2.303 force 8.44032e+02  
>>> step 0 atom 4 x 3.992 5.610 2.139 force 7.92786e+02  
>>>  
>>> I am afraid there is something wrong in the OpenCL kernels.  
>>>  
>>> I am using the topol-nvt-nogen.tpr I have uploaded on redmine.  
>>>  
>>> Best,  
>>> Carlo  
>>>  
>>>  
>>>  
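
The comparison described in the quoted message (the same single step run once on the CPU and once on the GPU, then the printed forces compared atom by atom) can be sketched in Python as below. This is only an illustration under stated assumptions: the line format is taken from the quoted output, the value after "force" is treated as a single number per atom, and the helper names and the tolerance REL_TOL are hypothetical choices, not GROMACS defaults.

```python
import re

# Line format assumed from the quoted output, e.g.
#   step 0 atom 1 x 3.940 5.612 2.226 force 1.90839e+03
LINE_RE = re.compile(
    r"step\s+(\d+)\s+atom\s+(\d+)\s+x\s+\S+\s+\S+\s+\S+\s+force\s+(\S+)")

def parse_forces(text):
    """Return {(step, atom): force_value} for every matching line in text."""
    return {(int(m.group(1)), int(m.group(2))): float(m.group(3))
            for m in LINE_RE.finditer(text)}

def relative_differences(reference, other):
    """Return |other - ref| / |ref| for each (step, atom) present in both."""
    return {key: abs(other[key] - ref) / abs(ref)
            for key, ref in reference.items()
            if key in other and ref != 0.0}

# Values copied from the quoted message (OpenCL build, CPU vs. GPU nonbondeds).
cpu = parse_forces("""
step 0 atom 1 x 3.940 5.612 2.226 force 1.90838e+03
step 0 atom 2 x 3.852 5.659 2.211 force 4.24848e+02
step 0 atom 3 x 3.979 5.665 2.303 force 6.89470e+02
step 0 atom 4 x 3.992 5.610 2.139 force 7.42043e+02
""")
opencl = parse_forces("""
step 0 atom 1 x 3.940 5.612 2.226 force 1.48597e+03
step 0 atom 2 x 3.852 5.659 2.211 force 6.26942e+02
step 0 atom 3 x 3.979 5.665 2.303 force 8.44032e+02
step 0 atom 4 x 3.992 5.610 2.139 force 7.92786e+02
""")

REL_TOL = 1e-3  # hypothetical threshold, chosen only for this illustration
diffs = relative_differences(cpu, opencl)
suspicious = {key: d for key, d in diffs.items() if d > REL_TOL}
for (step, atom), d in sorted(suspicious.items()):
    print(f"step {step} atom {atom}: relative difference {d:.3f}")
```

With these numbers all four atoms exceed the illustrative tolerance, whereas the two CPU runs quoted above differ only in the fifth or sixth significant digit.
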
>>> --  
>>> Gromacs Developers mailing list  
>>>  
>>> * Please search the archive at  
>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-developers_List before  
>>> posting!  
>>>  
>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists  
>>>  
>>> * For (un)subscribe requests visit  
>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-developers  
>>> or send a mail to gmx-developers-request at gromacs.org.  
>>>  
>  

-- 
Erik Lindahl <erik.lindahl at gmail.com> 
Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm University 
Professor of Theoretical biophysics, Dept. Theoretical Physics, Royal Inst. Technology 
Science for Life Laboratory, Box 1031, 17121 Solna, Sweden