[gmx-users] GPU low performance
Carmen Di Giovanni
cdigiova at unina.it
Thu Feb 19 18:44:55 CET 2015
Szilard,
about:
1) Fatal error:
Setting the number of thread-MPI threads is only supported with thread-MPI
and Gromacs was compiled without thread-MPI
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
The error quite clearly explains that you're trying to use mdrun's
built-in thread-MPI parallelization, but you have a binary that does
not support it. Use the MPI launching syntax instead.
Can you help me with the MPI launching syntax? What is the suitable
command?
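(For reference, a minimal sketch of what the MPI launching syntax usually
looks like, assuming the gmx_mpi binary was built against Open MPI and you
want 8 ranks spread over the two GPUs as in the failing command:

   mpirun -np 8 gmx_mpi mdrun -deffnm nvt -gpu_id 00001111

Here the number of ranks is set by the MPI launcher (mpirun -np), not by
mdrun's -ntmpi option, which only applies to thread-MPI builds.)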
2) Have you looked at the performance table at the end of the log?
You are wasting a large amount of runtime calculating energies every
step, and this overhead shows up in multiple places in the code - one of
them being the non-timed code parts, which typically take <3%.
How can I reduce the runtime spent calculating the energies every step?
Do I need to modify something in the .mdp file?
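(A minimal sketch of the kind of .mdp change this would involve, assuming
the per-step energy evaluation comes from the nstcalcenergy = 1 and
nstcomm = 1 values visible in the quoted log below:

   nstcalcenergy = 100    ; compute energies only every 100 steps (the default)
   nstcomm       = 100    ; COM removal every step also forces global communication

nstenergy = 2500 can stay as it is, since it only controls how often
energies are written to the .edr file. The .tpr would have to be
regenerated with grompp afterwards.)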
Thank you in advance
Carmen
--
Carmen Di Giovanni, PhD
Dept. of Pharmaceutical and Toxicological Chemistry
"Drug Discovery Lab"
University of Naples "Federico II"
Via D. Montesano, 49
80131 Naples
Tel.: ++39 081 678623
Fax: ++39 081 678100
Email: cdigiova at unina.it
Quoting Szilárd Páll <pall.szilard at gmail.com>:
> On Thu, Feb 19, 2015 at 11:32 AM, Carmen Di Giovanni
> <cdigiova at unina.it> wrote:
>> Dear Szilárd,
>>
>> 1) the output of command nvidia-smi -ac 2600,758 is
>>
>> [root at localhost test_gpu]# nvidia-smi -ac 2600,758
>> Applications clocks set to "(MEM 2600, SM 758)" for GPU 0000:03:00.0
>>
>> Warning: persistence mode is disabled on this device. This settings will go
>> back to default as soon as driver unloads (e.g. last application like
>> nvidia-smi or cuda application terminates). Run with [--help | -h] switch to
>> get more information on how to enable persistence mode.
>
> run nvidia-smi -pm 1 if you want to avoid that.
>
>> Setting applications clocks is not supported for GPU 0000:82:00.0.
>> Treating as warning and moving on.
>> All done.
>> ----------------------------------------------------------------------------
>> 2) I decreased nstlist to 20.
>> However, when I run the command:
>> gmx_mpi mdrun -deffnm nvt -ntmpi 8 -gpu_id 00001111
>> it gives me a fatal error:
>>
>> GROMACS: gmx mdrun, VERSION 5.0
>> Executable: /opt/SW/gromacs-5.0/build/mpi-cuda/bin/gmx_mpi
>> Library dir: /opt/SW/gromacs-5.0/share/top
>> Command line:
>> gmx_mpi mdrun -deffnm nvt -ntmpi 8 -gpu_id 00001111
>>
>>
>> Back Off! I just backed up nvt.log to ./#nvt.log.8#
>> Reading file nvt.tpr, VERSION 5.0 (single precision)
>> Changing nstlist from 10 to 40, rlist from 1 to 1.097
>>
>>
>> -------------------------------------------------------
>> Program gmx_mpi, VERSION 5.0
>> Source code file: /opt/SW/gromacs-5.0/src/programs/mdrun/runner.c, line: 876
>>
>> Fatal error:
>> Setting the number of thread-MPI threads is only supported with thread-MPI
>> and Gromacs was compiled without thread-MPI
>> For more information and tips for troubleshooting, please check the GROMACS
>> website at http://www.gromacs.org/Documentation/Errors
>> -------------------------------------------------------
>
> The error quite clearly explains that you're trying to use mdrun's
> built-in thread-MPI parallelization, but you have a binary that does
> not support it. Use the MPI launching syntax instead.
>
>> Halting program gmx_mpi
>>
>> gcq#223: "Jesus Not Only Saves, He Also Frequently Makes Backups." (Myron
>> Bradshaw)
>>
>> --------------------------------------------------------------------------
>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>> with errorcode -1.
>>
>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>> You may or may not see output from other processes, depending on
>> exactly when Open MPI kills them.
>> -------------------------------------------------------------------------
>>
>>
>> 4) I don't understand how I can reduce the "Rest" time
>
> Have you looked at the performance table at the end of the log?
> You are wasting a large amount of runtime calculating energies every
> step and this overhead comes in multiple places in the code - one of
> them being the non-timed code parts which typically take <3%.
>
> Cheers,
> --
> Szilard
>
>
>>
>> Carmen
>>
>>
>>
>> --
>> Carmen Di Giovanni, PhD
>> Dept. of Pharmaceutical and Toxicological Chemistry
>> "Drug Discovery Lab"
>> University of Naples "Federico II"
>> Via D. Montesano, 49
>> 80131 Naples
>> Tel.: ++39 081 678623
>> Fax: ++39 081 678100
>> Email: cdigiova at unina.it
>>
>>
>>
>> Quoting Szilárd Páll <pall.szilard at gmail.com>:
>>
>>> Please keep the mails on the list.
>>>
>>> On Wed, Feb 18, 2015 at 6:32 PM, Carmen Di Giovanni <cdigiova at unina.it>
>>> wrote:
>>>>
>>>> nvidia-smi -q -g 0
>>>>
>>>> ==============NVSMI LOG==============
>>>>
>>>> Timestamp : Wed Feb 18 18:30:01 2015
>>>> Driver Version : 340.24
>>>>
>>>> Attached GPUs : 2
>>>> GPU 0000:03:00.0
>>>> Product Name : Tesla K20c
>>>
>>> [...]
>>>>
>>>> Clocks
>>>> Graphics : 705 MHz
>>>> SM : 705 MHz
>>>> Memory : 2600 MHz
>>>> Applications Clocks
>>>> Graphics : 705 MHz
>>>> Memory : 2600 MHz
>>>> Default Applications Clocks
>>>> Graphics : 705 MHz
>>>> Memory : 2600 MHz
>>>> Max Clocks
>>>> Graphics : 758 MHz
>>>> SM : 758 MHz
>>>> Memory : 2600 MHz
>>>
>>>
>>> This is the relevant part I was looking for. The Tesla K20c supports
>>> setting a so-called application clock, which essentially means that
>>> you can bump its clock frequency using the NVIDIA management tool
>>> nvidia-smi from the default 705 MHz to 758 MHz.
>>>
>>> Use the command:
>>> nvidia-smi -ac 2600,758
>>>
>>> This should give you another 7% or so (I didn't remember the correct
>>> max clock before, that's why I was guessing 5%).
>>>
>>> Cheers,
>>> Szilard
>>>
>>>> Clock Policy
>>>> Auto Boost : N/A
>>>> Auto Boost Default : N/A
>>>> Compute Processes
>>>> Process ID : 19441
>>>> Name : gmx_mpi
>>>> Used GPU Memory : 110 MiB
>>>>
>>>> [carmendigi at localhost test_gpu]$
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Carmen Di Giovanni, PhD
>>>> Dept. of Pharmaceutical and Toxicological Chemistry
>>>> "Drug Discovery Lab"
>>>> University of Naples "Federico II"
>>>> Via D. Montesano, 49
>>>> 80131 Naples
>>>> Tel.: ++39 081 678623
>>>> Fax: ++39 081 678100
>>>> Email: cdigiova at unina.it
>>>>
>>>>
>>>>
>>>> Quoting Szilárd Páll <pall.szilard at gmail.com>:
>>>>
>>>>> As I suggested above please use pastebin.com or similar!
>>>>> --
>>>>> Szilárd
>>>>>
>>>>>
>>>>> On Wed, Feb 18, 2015 at 6:09 PM, Carmen Di Giovanni <cdigiova at unina.it>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Dear Szilárd, it's not possible to attach the full log file to the forum
>>>>>> mail because it is too big.
>>>>>> I will send it to your private mail address.
>>>>>> Thank you in advance
>>>>>> Carmen
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Carmen Di Giovanni, PhD
>>>>>> Dept. of Pharmaceutical and Toxicological Chemistry
>>>>>> "Drug Discovery Lab"
>>>>>> University of Naples "Federico II"
>>>>>> Via D. Montesano, 49
>>>>>> 80131 Naples
>>>>>> Tel.: ++39 081 678623
>>>>>> Fax: ++39 081 678100
>>>>>> Email: cdigiova at unina.it
>>>>>>
>>>>>>
>>>>>>
>>>>>> Quoting Szilárd Páll <pall.szilard at gmail.com>:
>>>>>>
>>>>>>> We need a *full* log file, not parts of it!
>>>>>>>
>>>>>>> You can try running with "-ntomp 16 -pin on" - it may be a bit faster
>>>>>>> to not use HyperThreading.
>>>>>>> --
>>>>>>> Szilárd
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 18, 2015 at 5:20 PM, Carmen Di Giovanni
>>>>>>> <cdigiova at unina.it>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Justin,
>>>>>>>> the problem is evident for all calculations.
>>>>>>>> This is the log file of a recent run:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --------------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> Log file opened on Mon Dec 22 16:28:00 2014
>>>>>>>> Host: localhost.localdomain pid: 8378 rank ID: 0 number of ranks:
>>>>>>>> 1
>>>>>>>> GROMACS: gmx mdrun, VERSION 5.0
>>>>>>>>
>>>>>>>> GROMACS is written by:
>>>>>>>> Emile Apol Rossen Apostolov Herman J.C. Berendsen Par
>>>>>>>> Bjelkmar
>>>>>>>> Aldert van Buuren Rudi van Drunen Anton Feenstra Sebastian
>>>>>>>> Fritsch
>>>>>>>> Gerrit Groenhof Christoph Junghans Peter Kasson Carsten
>>>>>>>> Kutzner
>>>>>>>> Per Larsson Justin A. Lemkul Magnus Lundborg Pieter
>>>>>>>> Meulenhoff
>>>>>>>> Erik Marklund Teemu Murtola Szilard Pall Sander Pronk
>>>>>>>> Roland Schulz Alexey Shvetsov Michael Shirts Alfons
>>>>>>>> Sijbers
>>>>>>>> Peter Tieleman Christian Wennberg Maarten Wolf
>>>>>>>> and the project leaders:
>>>>>>>> Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
>>>>>>>>
>>>>>>>> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>>>>>>>> Copyright (c) 2001-2014, The GROMACS development team at
>>>>>>>> Uppsala University, Stockholm University and
>>>>>>>> the Royal Institute of Technology, Sweden.
>>>>>>>> check out http://www.gromacs.org for more information.
>>>>>>>>
>>>>>>>> GROMACS is free software; you can redistribute it and/or modify it
>>>>>>>> under the terms of the GNU Lesser General Public License
>>>>>>>> as published by the Free Software Foundation; either version 2.1
>>>>>>>> of the License, or (at your option) any later version.
>>>>>>>>
>>>>>>>> GROMACS: gmx mdrun, VERSION 5.0
>>>>>>>> Executable: /opt/SW/gromacs-5.0/build/mpi-cuda/bin/gmx_mpi
>>>>>>>> Library dir: /opt/SW/gromacs-5.0/share/top
>>>>>>>> Command line:
>>>>>>>> gmx_mpi mdrun -deffnm prod_20ns
>>>>>>>>
>>>>>>>> Gromacs version: VERSION 5.0
>>>>>>>> Precision: single
>>>>>>>> Memory model: 64 bit
>>>>>>>> MPI library: MPI
>>>>>>>> OpenMP support: enabled
>>>>>>>> GPU support: enabled
>>>>>>>> invsqrt routine: gmx_software_invsqrt(x)
>>>>>>>> SIMD instructions: AVX_256
>>>>>>>> FFT library: fftw-3.3.3-sse2
>>>>>>>> RDTSCP usage: enabled
>>>>>>>> C++11 compilation: disabled
>>>>>>>> TNG support: enabled
>>>>>>>> Tracing support: disabled
>>>>>>>> Built on: Thu Jul 31 18:30:37 CEST 2014
>>>>>>>> Built by: root at localhost.localdomain [CMAKE]
>>>>>>>> Build OS/arch: Linux 2.6.32-431.el6.x86_64 x86_64
>>>>>>>> Build CPU vendor: GenuineIntel
>>>>>>>> Build CPU brand: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
>>>>>>>> Build CPU family: 6 Model: 62 Stepping: 4
>>>>>>>> Build CPU features: aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm
>>>>>>>> mmx
>>>>>>>> msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp
>>>>>>>> sse2
>>>>>>>> sse3
>>>>>>>> sse4.1 sse4.2 ssse3 tdt x2apic
>>>>>>>> C compiler: /usr/bin/cc GNU 4.4.7
>>>>>>>> C compiler flags: -mavx -Wno-maybe-uninitialized -Wextra
>>>>>>>> -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith
>>>>>>>> -Wall
>>>>>>>> -Wno-unused -Wunused-value -Wunused-parameter -fomit-frame-pointer
>>>>>>>> -funroll-all-loops -Wno-array-bounds -O3 -DNDEBUG
>>>>>>>> C++ compiler: /usr/bin/c++ GNU 4.4.7
>>>>>>>> C++ compiler flags: -mavx -Wextra -Wno-missing-field-initializers
>>>>>>>> -Wpointer-arith -Wall -Wno-unused-function -fomit-frame-pointer
>>>>>>>> -funroll-all-loops -Wno-array-bounds -O3 -DNDEBUG
>>>>>>>> Boost version: 1.55.0 (internal)
>>>>>>>> CUDA compiler: /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda
>>>>>>>> compiler
>>>>>>>> driver;Copyright (c) 2005-2013 NVIDIA Corporation;Built on
>>>>>>>> Thu_Mar_13_11:58:58_PDT_2014;Cuda compilation tools, release 6.0,
>>>>>>>> V6.0.1
>>>>>>>> CUDA compiler
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_20,code=sm_21;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_35,code=compute_35;-use_fast_math;-Xcompiler;-fPIC
>>>>>>>> ;
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ;-mavx;-Wextra;-Wno-missing-field-initializers;-Wpointer-arith;-Wall;-Wno-unused-function;-fomit-frame-pointer;-funroll-all-loops;-Wno-array-bounds;-O3;-DNDEBUG
>>>>>>>> CUDA driver: 6.50
>>>>>>>> CUDA runtime: 6.0
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>>>>>>>> B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
>>>>>>>> GROMACS 4: Algorithms for highly efficient, load-balanced, and
>>>>>>>> scalable
>>>>>>>> molecular simulation
>>>>>>>> J. Chem. Theory Comput. 4 (2008) pp. 435-447
>>>>>>>> -------- -------- --- Thank You --- -------- --------
>>>>>>>>
>>>>>>>>
>>>>>>>> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>>>>>>>> D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H.
>>>>>>>> J.
>>>>>>>> C.
>>>>>>>> Berendsen
>>>>>>>> GROMACS: Fast, Flexible and Free
>>>>>>>> J. Comp. Chem. 26 (2005) pp. 1701-1719
>>>>>>>> -------- -------- --- Thank You --- -------- --------
>>>>>>>>
>>>>>>>>
>>>>>>>> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>>>>>>>> E. Lindahl and B. Hess and D. van der Spoel
>>>>>>>> GROMACS 3.0: A package for molecular simulation and trajectory
>>>>>>>> analysis
>>>>>>>> J. Mol. Mod. 7 (2001) pp. 306-317
>>>>>>>> -------- -------- --- Thank You --- -------- --------
>>>>>>>>
>>>>>>>>
>>>>>>>> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>>>>>>>> H. J. C. Berendsen, D. van der Spoel and R. van Drunen
>>>>>>>> GROMACS: A message-passing parallel molecular dynamics implementation
>>>>>>>> Comp. Phys. Comm. 91 (1995) pp. 43-56
>>>>>>>> -------- -------- --- Thank You --- -------- --------
>>>>>>>>
>>>>>>>>
>>>>>>>> For optimal performance with a GPU nstlist (now 10) should be larger.
>>>>>>>> The optimum depends on your CPU and GPU resources.
>>>>>>>> You might want to try several nstlist values.
>>>>>>>> Changing nstlist from 10 to 40, rlist from 1.2 to 1.285
>>>>>>>>
>>>>>>>> Input Parameters:
>>>>>>>> integrator = md
>>>>>>>> tinit = 0
>>>>>>>> dt = 0.002
>>>>>>>> nsteps = 10000000
>>>>>>>> init-step = 0
>>>>>>>> simulation-part = 1
>>>>>>>> comm-mode = Linear
>>>>>>>> nstcomm = 1
>>>>>>>> bd-fric = 0
>>>>>>>> ld-seed = 1993
>>>>>>>> emtol = 10
>>>>>>>> emstep = 0.01
>>>>>>>> niter = 20
>>>>>>>> fcstep = 0
>>>>>>>> nstcgsteep = 1000
>>>>>>>> nbfgscorr = 10
>>>>>>>> rtpi = 0.05
>>>>>>>> nstxout = 2500
>>>>>>>> nstvout = 2500
>>>>>>>> nstfout = 0
>>>>>>>> nstlog = 2500
>>>>>>>> nstcalcenergy = 1
>>>>>>>> nstenergy = 2500
>>>>>>>> nstxout-compressed = 500
>>>>>>>> compressed-x-precision = 1000
>>>>>>>> cutoff-scheme = Verlet
>>>>>>>> nstlist = 40
>>>>>>>> ns-type = Grid
>>>>>>>> pbc = xyz
>>>>>>>> periodic-molecules = FALSE
>>>>>>>> verlet-buffer-tolerance = 0.005
>>>>>>>> rlist = 1.285
>>>>>>>> rlistlong = 1.285
>>>>>>>> nstcalclr = 10
>>>>>>>> coulombtype = PME
>>>>>>>> coulomb-modifier = Potential-shift
>>>>>>>> rcoulomb-switch = 0
>>>>>>>> rcoulomb = 1.2
>>>>>>>> epsilon-r = 1
>>>>>>>> epsilon-rf = 1
>>>>>>>> vdw-type = Cut-off
>>>>>>>> vdw-modifier = Potential-shift
>>>>>>>> rvdw-switch = 0
>>>>>>>> rvdw = 1.2
>>>>>>>> DispCorr = No
>>>>>>>> table-extension = 1
>>>>>>>> fourierspacing = 0.135
>>>>>>>> fourier-nx = 128
>>>>>>>> fourier-ny = 128
>>>>>>>> fourier-nz = 128
>>>>>>>> pme-order = 4
>>>>>>>> ewald-rtol = 1e-05
>>>>>>>> ewald-rtol-lj = 0.001
>>>>>>>> lj-pme-comb-rule = Geometric
>>>>>>>> ewald-geometry = 0
>>>>>>>> epsilon-surface = 0
>>>>>>>> implicit-solvent = No
>>>>>>>> gb-algorithm = Still
>>>>>>>> nstgbradii = 1
>>>>>>>> rgbradii = 2
>>>>>>>> gb-epsilon-solvent = 80
>>>>>>>> gb-saltconc = 0
>>>>>>>> gb-obc-alpha = 1
>>>>>>>> gb-obc-beta = 0.8
>>>>>>>> gb-obc-gamma = 4.85
>>>>>>>> gb-dielectric-offset = 0.009
>>>>>>>> sa-algorithm = Ace-approximation
>>>>>>>> sa-surface-tension = 2.092
>>>>>>>> tcoupl = V-rescale
>>>>>>>> nsttcouple = 10
>>>>>>>> nh-chain-length = 0
>>>>>>>> print-nose-hoover-chain-variables = FALSE
>>>>>>>> pcoupl = No
>>>>>>>> pcoupltype = Semiisotropic
>>>>>>>> nstpcouple = -1
>>>>>>>> tau-p = 0.5
>>>>>>>> compressibility (3x3):
>>>>>>>> compressibility[ 0]={ 0.00000e+00, 0.00000e+00,
>>>>>>>> 0.00000e+00}
>>>>>>>> compressibility[ 1]={ 0.00000e+00, 0.00000e+00,
>>>>>>>> 0.00000e+00}
>>>>>>>> compressibility[ 2]={ 0.00000e+00, 0.00000e+00,
>>>>>>>> 0.00000e+00}
>>>>>>>> ref-p (3x3):
>>>>>>>> ref-p[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
>>>>>>>> ref-p[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
>>>>>>>> ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
>>>>>>>> refcoord-scaling = No
>>>>>>>> posres-com (3):
>>>>>>>> posres-com[0]= 0.00000e+00
>>>>>>>> posres-com[1]= 0.00000e+00
>>>>>>>> posres-com[2]= 0.00000e+00
>>>>>>>> posres-comB (3):
>>>>>>>> posres-comB[0]= 0.00000e+00
>>>>>>>> posres-comB[1]= 0.00000e+00
>>>>>>>> posres-comB[2]= 0.00000e+00
>>>>>>>> QMMM = FALSE
>>>>>>>> QMconstraints = 0
>>>>>>>> QMMMscheme = 0
>>>>>>>> MMChargeScaleFactor = 1
>>>>>>>> qm-opts:
>>>>>>>> ngQM = 0
>>>>>>>> constraint-algorithm = Lincs
>>>>>>>> continuation = FALSE
>>>>>>>> Shake-SOR = FALSE
>>>>>>>> shake-tol = 0.0001
>>>>>>>> lincs-order = 4
>>>>>>>> lincs-iter = 1
>>>>>>>> lincs-warnangle = 30
>>>>>>>> nwall = 0
>>>>>>>> wall-type = 9-3
>>>>>>>> wall-r-linpot = -1
>>>>>>>> wall-atomtype[0] = -1
>>>>>>>> wall-atomtype[1] = -1
>>>>>>>> wall-density[0] = 0
>>>>>>>> wall-density[1] = 0
>>>>>>>> wall-ewald-zfac = 3
>>>>>>>> pull = no
>>>>>>>> rotation = FALSE
>>>>>>>> interactiveMD = FALSE
>>>>>>>> disre = No
>>>>>>>> disre-weighting = Conservative
>>>>>>>> disre-mixed = FALSE
>>>>>>>> dr-fc = 1000
>>>>>>>> dr-tau = 0
>>>>>>>> nstdisreout = 100
>>>>>>>> orire-fc = 0
>>>>>>>> orire-tau = 0
>>>>>>>> nstorireout = 100
>>>>>>>> free-energy = no
>>>>>>>> cos-acceleration = 0
>>>>>>>> deform (3x3):
>>>>>>>> deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
>>>>>>>> deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
>>>>>>>> deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
>>>>>>>> simulated-tempering = FALSE
>>>>>>>> E-x:
>>>>>>>> n = 0
>>>>>>>> E-xt:
>>>>>>>> n = 0
>>>>>>>> E-y:
>>>>>>>> n = 0
>>>>>>>> E-yt:
>>>>>>>> n = 0
>>>>>>>> E-z:
>>>>>>>> n = 0
>>>>>>>> E-zt:
>>>>>>>> n = 0
>>>>>>>> swapcoords = no
>>>>>>>> adress = FALSE
>>>>>>>> userint1 = 0
>>>>>>>> userint2 = 0
>>>>>>>> userint3 = 0
>>>>>>>> userint4 = 0
>>>>>>>> userreal1 = 0
>>>>>>>> userreal2 = 0
>>>>>>>> userreal3 = 0
>>>>>>>> userreal4 = 0
>>>>>>>> grpopts:
>>>>>>>> nrdf: 869226
>>>>>>>> ref-t: 300
>>>>>>>> tau-t: 0.1
>>>>>>>> annealing: No
>>>>>>>> annealing-npoints: 0
>>>>>>>> acc: 0 0 0
>>>>>>>> nfreeze: N N N
>>>>>>>> energygrp-flags[ 0]: 0
>>>>>>>> Using 1 MPI process
>>>>>>>> Using 32 OpenMP threads
>>>>>>>>
>>>>>>>> Detecting CPU SIMD instructions.
>>>>>>>> Present hardware specification:
>>>>>>>> Vendor: GenuineIntel
>>>>>>>> Brand: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
>>>>>>>> Family: 6 Model: 62 Stepping: 4
>>>>>>>> Features: aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm mmx msr
>>>>>>>> nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2
>>>>>>>> sse3
>>>>>>>> sse4.1 sse4.2 ssse3 tdt x2apic
>>>>>>>> SIMD instructions most likely to fit this hardware: AVX_256
>>>>>>>> SIMD instructions selected at GROMACS compile time: AVX_256
>>>>>>>>
>>>>>>>>
>>>>>>>> 2 GPUs detected on host localhost.localdomain:
>>>>>>>> #0: NVIDIA Tesla K20c, compute cap.: 3.5, ECC: yes, stat:
>>>>>>>> compatible
>>>>>>>> #1: NVIDIA GeForce GTX 650, compute cap.: 3.0, ECC: no, stat:
>>>>>>>> compatible
>>>>>>>>
>>>>>>>> 1 GPU auto-selected for this run.
>>>>>>>> Mapping of GPU to the 1 PP rank in this node: #0
>>>>>>>>
>>>>>>>>
>>>>>>>> NOTE: potentially sub-optimal launch configuration, gmx_mpi started
>>>>>>>> with
>>>>>>>> less
>>>>>>>> PP MPI process per node than GPUs available.
>>>>>>>> Each PP MPI process can use only one GPU, 1 GPU per node will
>>>>>>>> be
>>>>>>>> used.
>>>>>>>>
>>>>>>>> Will do PME sum in reciprocal space for electrostatic interactions.
>>>>>>>>
>>>>>>>> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>>>>>>>> U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G.
>>>>>>>> Pedersen
>>>>>>>> A smooth particle mesh Ewald method
>>>>>>>> J. Chem. Phys. 103 (1995) pp. 8577-8592
>>>>>>>> -------- -------- --- Thank You --- -------- --------
>>>>>>>>
>>>>>>>> Will do ordinary reciprocal space Ewald sum.
>>>>>>>> Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
>>>>>>>> Cut-off's: NS: 1.285 Coulomb: 1.2 LJ: 1.2
>>>>>>>> System total charge: -0.012
>>>>>>>> Generated table with 1142 data points for Ewald.
>>>>>>>> Tabscale = 500 points/nm
>>>>>>>> Generated table with 1142 data points for LJ6.
>>>>>>>> Tabscale = 500 points/nm
>>>>>>>> Generated table with 1142 data points for LJ12.
>>>>>>>> Tabscale = 500 points/nm
>>>>>>>> Generated table with 1142 data points for 1-4 COUL.
>>>>>>>> Tabscale = 500 points/nm
>>>>>>>> Generated table with 1142 data points for 1-4 LJ6.
>>>>>>>> Tabscale = 500 points/nm
>>>>>>>> Generated table with 1142 data points for 1-4 LJ12.
>>>>>>>> Tabscale = 500 points/nm
>>>>>>>>
>>>>>>>> Using CUDA 8x8 non-bonded kernels
>>>>>>>>
>>>>>>>> Potential shift: LJ r^-12: -1.122e-01 r^-6: -3.349e-01, Ewald
>>>>>>>> -1.000e-05
>>>>>>>> Initialized non-bonded Ewald correction tables, spacing: 7.82e-04
>>>>>>>> size:
>>>>>>>> 1536
>>>>>>>>
>>>>>>>> Removing pbc first time
>>>>>>>> Pinning threads with an auto-selected logical core stride of 1
>>>>>>>>
>>>>>>>> Initializing LINear Constraint Solver
>>>>>>>>
>>>>>>>> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>>>>>>>> B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
>>>>>>>> LINCS: A Linear Constraint Solver for molecular simulations
>>>>>>>> J. Comp. Chem. 18 (1997) pp. 1463-1472
>>>>>>>> -------- -------- --- Thank You --- -------- --------
>>>>>>>>
>>>>>>>> The number of constraints is 5913
>>>>>>>>
>>>>>>>> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>>>>>>>> S. Miyamoto and P. A. Kollman
>>>>>>>> SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for
>>>>>>>> Rigid
>>>>>>>> Water Models
>>>>>>>> J. Comp. Chem. 13 (1992) pp. 952-962
>>>>>>>> -------- -------- --- Thank You --- -------- --------
>>>>>>>>
>>>>>>>> Center of mass motion removal mode is Linear
>>>>>>>> We have the following groups for center of mass motion removal:
>>>>>>>> 0: rest
>>>>>>>>
>>>>>>>> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>>>>>>>> G. Bussi, D. Donadio and M. Parrinello
>>>>>>>> Canonical sampling through velocity rescaling
>>>>>>>> J. Chem. Phys. 126 (2007) pp. 014101
>>>>>>>> -------- -------- --- Thank You --- -------- --------
>>>>>>>>
>>>>>>>> There are: 434658 Atoms
>>>>>>>>
>>>>>>>> Constraining the starting coordinates (step 0)
>>>>>>>>
>>>>>>>> Constraining the coordinates at t0-dt (step 0)
>>>>>>>> RMS relative constraint deviation after constraining: 3.67e-05
>>>>>>>> Initial temperature: 300.5 K
>>>>>>>>
>>>>>>>> Started mdrun on rank 0 Mon Dec 22 16:28:01 2014
>>>>>>>> Step Time Lambda
>>>>>>>> 0 0.00000 0.00000
>>>>>>>>
>>>>>>>> Energies (kJ/mol)
>>>>>>>> G96Angle Proper Dih. Improper Dih. LJ-14
>>>>>>>> Coulomb-14
>>>>>>>> 9.74139e+03 4.34956e+03 2.97359e+03 -1.93107e+02
>>>>>>>> 8.05534e+04
>>>>>>>> LJ (SR) Coulomb (SR) Coul. recip. Potential
>>>>>>>> Kinetic
>>>>>>>> En.
>>>>>>>> 1.01340e+06 -7.13271e+06 2.01361e+04 -6.00175e+06
>>>>>>>> 1.09887e+06
>>>>>>>> Total Energy Conserved En. Temperature Pressure (bar)
>>>>>>>> Constr.
>>>>>>>> rmsd
>>>>>>>> -4.90288e+06 -4.90288e+06 3.04092e+02 1.70897e+02
>>>>>>>> 2.16683e-05
>>>>>>>>
>>>>>>>> step 80: timed with pme grid 128 128 128, coulomb cutoff 1.200:
>>>>>>>> 6279.0
>>>>>>>> M-cycles
>>>>>>>> step 160: timed with pme grid 112 112 112, coulomb cutoff 1.306:
>>>>>>>> 6962.2
>>>>>>>> M-cycles
>>>>>>>> step 240: timed with pme grid 100 100 100, coulomb cutoff 1.463:
>>>>>>>> 8406.5
>>>>>>>> M-cycles
>>>>>>>> step 320: timed with pme grid 128 128 128, coulomb cutoff 1.200:
>>>>>>>> 6424.0
>>>>>>>> M-cycles
>>>>>>>> step 400: timed with pme grid 120 120 120, coulomb cutoff 1.219:
>>>>>>>> 6369.1
>>>>>>>> M-cycles
>>>>>>>> step 480: timed with pme grid 112 112 112, coulomb cutoff 1.306:
>>>>>>>> 7309.0
>>>>>>>> M-cycles
>>>>>>>> step 560: timed with pme grid 108 108 108, coulomb cutoff 1.355:
>>>>>>>> 7521.2
>>>>>>>> M-cycles
>>>>>>>> step 640: timed with pme grid 104 104 104, coulomb cutoff 1.407:
>>>>>>>> 8369.8
>>>>>>>> M-cycles
>>>>>>>> optimal pme grid 128 128 128, coulomb cutoff 1.200
>>>>>>>> Step Time Lambda
>>>>>>>> 2500 5.00000 0.00000
>>>>>>>>
>>>>>>>> Energies (kJ/mol)
>>>>>>>> G96Angle Proper Dih. Improper Dih. LJ-14
>>>>>>>> Coulomb-14
>>>>>>>> 9.72545e+03 4.33046e+03 2.98087e+03 -1.95794e+02
>>>>>>>> 8.05967e+04
>>>>>>>> LJ (SR) Coulomb (SR) Coul. recip. Potential
>>>>>>>> Kinetic
>>>>>>>> En.
>>>>>>>> 1.01293e+06 -7.13110e+06 2.01689e+04 -6.00057e+06
>>>>>>>> 1.08489e+06
>>>>>>>> Total Energy Conserved En. Temperature Pressure (bar)
>>>>>>>> Constr.
>>>>>>>> rmsd
>>>>>>>> -4.91567e+06 -4.90300e+06 3.00225e+02 1.36173e+02
>>>>>>>> 2.25998e-05
>>>>>>>>
>>>>>>>> Step Time Lambda
>>>>>>>> 5000 10.00000 0.00000
>>>>>>>>
>>>>>>>> ............
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you in advance
>>>>>>>>
>>>>>>>> --
>>>>>>>> Carmen Di Giovanni, PhD
>>>>>>>> Dept. of Pharmaceutical and Toxicological Chemistry
>>>>>>>> "Drug Discovery Lab"
>>>>>>>> University of Naples "Federico II"
>>>>>>>> Via D. Montesano, 49
>>>>>>>> 80131 Naples
>>>>>>>> Tel.: ++39 081 678623
>>>>>>>> Fax: ++39 081 678100
>>>>>>>> Email: cdigiova at unina.it
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Quoting Justin Lemkul <jalemkul at vt.edu>:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2/18/15 11:09 AM, Barnett, James W wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> What's your exact command?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> A full .log file would be even better; it would tell us everything
>>>>>>>>> we
>>>>>>>>> need
>>>>>>>>> to know :)
>>>>>>>>>
>>>>>>>>> -Justin
>>>>>>>>>
>>>>>>>>>> Have you reviewed this page:
>>>>>>>>>>
>>>>>>>>>> http://www.gromacs.org/Documentation/Acceleration_and_parallelization
>>>>>>>>>>
>>>>>>>>>> James "Wes" Barnett
>>>>>>>>>> Ph.D. Candidate
>>>>>>>>>> Chemical and Biomolecular Engineering
>>>>>>>>>>
>>>>>>>>>> Tulane University
>>>>>>>>>> Boggs Center for Energy and Biotechnology, Room 341-B
>>>>>>>>>>
>>>>>>>>>> ________________________________________
>>>>>>>>>> From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se
>>>>>>>>>> <gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of
>>>>>>>>>> Carmen
>>>>>>>>>> Di
>>>>>>>>>> Giovanni <cdigiova at unina.it>
>>>>>>>>>> Sent: Wednesday, February 18, 2015 10:06 AM
>>>>>>>>>> To: gromacs.org_gmx-users at maillist.sys.kth.se
>>>>>>>>>> Subject: Re: [gmx-users] GPU low performance
>>>>>>>>>>
>>>>>>>>>> I post the message of an MD run:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Force evaluation time GPU/CPU: 40.974 ms/24.437 ms = 1.677
>>>>>>>>>> For optimal performance this ratio should be close to 1!
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> NOTE: The GPU has >20% more load than the CPU. This imbalance
>>>>>>>>>> causes
>>>>>>>>>> performance loss, consider using a shorter cut-off and a
>>>>>>>>>> finer
>>>>>>>>>> PME
>>>>>>>>>> grid.
>>>>>>>>>>
>>>>>>>>>> How can I solve this problem?
>>>>>>>>>> Thank you in advance
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Carmen Di Giovanni, PhD
>>>>>>>>>> Dept. of Pharmaceutical and Toxicological Chemistry
>>>>>>>>>> "Drug Discovery Lab"
>>>>>>>>>> University of Naples "Federico II"
>>>>>>>>>> Via D. Montesano, 49
>>>>>>>>>> 80131 Naples
>>>>>>>>>> Tel.: ++39 081 678623
>>>>>>>>>> Fax: ++39 081 678100
>>>>>>>>>> Email: cdigiova at unina.it
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Quoting Justin Lemkul <jalemkul at vt.edu>:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2/18/15 10:30 AM, Carmen Di Giovanni wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Dear all,
>>>>>>>>>>>> I'm working on a machine with an NVIDIA Tesla K20.
>>>>>>>>>>>> After a minimization on a protein of 1925 atoms this is the
>>>>>>>>>>>> message:
>>>>>>>>>>>>
>>>>>>>>>>>> Force evaluation time GPU/CPU: 2.923 ms/116.774 ms = 0.025
>>>>>>>>>>>> For optimal performance this ratio should be close to 1!
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Minimization is a poor indicator of performance. Do a real MD
>>>>>>>>>>> run.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> NOTE: The GPU has >25% less load than the CPU. This imbalance
>>>>>>>>>>>> causes
>>>>>>>>>>>> performance loss.
>>>>>>>>>>>>
>>>>>>>>>>>> Core t (s) Wall t (s) (%)
>>>>>>>>>>>> Time: 3289.010 205.891 1597.4
>>>>>>>>>>>> (steps/hour)
>>>>>>>>>>>> Performance: 8480.2
>>>>>>>>>>>> Finished mdrun on rank 0 Wed Feb 18 15:50:06 2015
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Can I improve the performance?
>>>>>>>>>>>> So far I have not found enough information in the forum to solve
>>>>>>>>>>>> this problem.
>>>>>>>>>>>> The log file is attached.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The list does not accept attachments. If you wish to share a
>>>>>>>>>>> file,
>>>>>>>>>>> upload it to a file-sharing service and provide a URL. The full
>>>>>>>>>>> .log is quite important for understanding your hardware,
>>>>>>>>>>> optimizations, and seeing full details of the performance
>>>>>>>>>>> breakdown.
>>>>>>>>>>> But again, base your assessment on MD, not EM.
>>>>>>>>>>>
>>>>>>>>>>> -Justin
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> ==================================================
>>>>>>>>>>>
>>>>>>>>>>> Justin A. Lemkul, Ph.D.
>>>>>>>>>>> Ruth L. Kirschstein NRSA Postdoctoral Fellow
>>>>>>>>>>>
>>>>>>>>>>> Department of Pharmaceutical Sciences
>>>>>>>>>>> School of Pharmacy
>>>>>>>>>>> Health Sciences Facility II, Room 629
>>>>>>>>>>> University of Maryland, Baltimore
>>>>>>>>>>> 20 Penn St.
>>>>>>>>>>> Baltimore, MD 21201
>>>>>>>>>>>
>>>>>>>>>>> jalemkul at outerbanks.umaryland.edu | (410) 706-7441
>>>>>>>>>>> http://mackerell.umaryland.edu/~jalemkul
>>>>>>>>>>>
>>>>>>>>>>> ==================================================
>>>>>>>>>>> --
>>>>>>>>>>> Gromacs Users mailing list
>>>>>>>>>>>
>>>>>>>>>>> * Please search the archive at
>>>>>>>>>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>>>>>>>>>>> posting!
>>>>>>>>>>>
>>>>>>>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>>>>>>>>>
>>>>>>>>>>> * For (un)subscribe requests visit
>>>>>>>>>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users
>>>>>>>>>>> or send a mail to gmx-users-request at gromacs.org.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Gromacs Users mailing list
>>>>>>>>>>
>>>>>>>>>> * Please search the archive at
>>>>>>>>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>>>>>>>>>> posting!
>>>>>>>>>>
>>>>>>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>>>>>>>>
>>>>>>>>>> * For (un)subscribe requests visit
>>>>>>>>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users
>>>>>>>>>> or
>>>>>>>>>> send a mail to gmx-users-request at gromacs.org.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ==================================================
>>>>>>>>>
>>>>>>>>> Justin A. Lemkul, Ph.D.
>>>>>>>>> Ruth L. Kirschstein NRSA Postdoctoral Fellow
>>>>>>>>>
>>>>>>>>> Department of Pharmaceutical Sciences
>>>>>>>>> School of Pharmacy
>>>>>>>>> Health Sciences Facility II, Room 629
>>>>>>>>> University of Maryland, Baltimore
>>>>>>>>> 20 Penn St.
>>>>>>>>> Baltimore, MD 21201
>>>>>>>>>
>>>>>>>>> jalemkul at outerbanks.umaryland.edu | (410) 706-7441
>>>>>>>>> http://mackerell.umaryland.edu/~jalemkul
>>>>>>>>>
>>>>>>>>> ==================================================
>>>>>>>>> --
>>>>>>>>> Gromacs Users mailing list
>>>>>>>>>
>>>>>>>>> * Please search the archive at
>>>>>>>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>>>>>>>>> posting!
>>>>>>>>>
>>>>>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>>>>>>>
>>>>>>>>> * For (un)subscribe requests visit
>>>>>>>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users
>>>>>>>>> or
>>>>>>>>> send
>>>>>>>>> a mail to gmx-users-request at gromacs.org.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Gromacs Users mailing list
>>>>>>>>
>>>>>>>> * Please search the archive at
>>>>>>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>>>>>>>> posting!
>>>>>>>>
>>>>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>>>>>>
>>>>>>>> * For (un)subscribe requests visit
>>>>>>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>>>>>>>> send a
>>>>>>>> mail to gmx-users-request at gromacs.org.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Gromacs Users mailing list
>>>>>>>
>>>>>>> * Please search the archive at
>>>>>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>>>>>>> posting!
>>>>>>>
>>>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>>>>>
>>>>>>> * For (un)subscribe requests visit
>>>>>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>>>>>>> send
>>>>>>> a mail to gmx-users-request at gromacs.org.
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>