[gmx-users] strange GPU load distribution

Alex nedomacho at gmail.com
Fri Apr 27 21:58:54 CEST 2018


Mark, I copied the exact command line from the script, right above the 
mdp file. It is literally how the script calls mdrun in this case:

gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm
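
For what it's worth, to follow your suggestion of checking which process
owns each GPU entry, something like this should work (just a sketch: the
query options assume a reasonably recent nvidia-smi, and 12981 is simply one
of the PIDs from the output further down the thread):

# list the compute processes on each GPU, then look up one PID's full command line
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
ps -o pid,user,args -p 12981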


On 4/27/2018 1:52 PM, Mark Abraham wrote:
> The group cutoff scheme can never run on a GPU, so none of that should
> matter. Use ps and find out what the command lines were.
>
> Mark
>
> On Fri, Apr 27, 2018, 21:37 Alex <nedomacho at gmail.com> wrote:
>
>> Update: we're removing commands one by one from the script that submits
>> the jobs causing the issue. The culprits are both EM and the MD run, and
>> the GPUs are affected _before_ MD starts loading the CPU, i.e. during the
>> initial setup of the EM run -- CPU load is near zero, while nvidia-smi
>> reports the mess. I wonder if this is in any way related to the timing
>> test we were failing a while back.
>> The mdrun call and mdp are below, though I suspect they have nothing to do
>> with what is happening. Any help would be highly appreciated.
>>
>> Alex
>>
>> ***
>>
>> gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm
>>
>> mdp:
>>
>> ; Run control
>> integrator               = md-vv       ; Velocity Verlet
>> tinit                    = 0
>> dt                       = 0.002
>> nsteps                   = 500000    ; 1 ns
>> nstcomm                  = 100
>> ; Output control
>> nstxout                  = 50000
>> nstvout                  = 50000
>> nstfout                  = 0
>> nstlog                   = 50000
>> nstenergy                = 50000
>> nstxout-compressed       = 0
>> ; Neighborsearching and short-range nonbonded interactions
>> cutoff-scheme            = group
>> nstlist                  = 10
>> ns_type                  = grid
>> pbc                      = xyz
>> rlist                    = 1.4
>> ; Electrostatics
>> coulombtype              = cutoff
>> rcoulomb                 = 1.4
>> ; van der Waals
>> vdwtype                  = user
>> vdw-modifier             = none
>> rvdw                     = 1.4
>> ; Apply long range dispersion corrections for Energy and Pressure
>> DispCorr                  = EnerPres
>> ; Spacing for the PME/PPPM FFT grid
>> fourierspacing           = 0.12
>> ; EWALD/PME/PPPM parameters
>> pme_order                = 6
>> ewald_rtol               = 1e-06
>> epsilon_surface          = 0
>> ; Temperature coupling
>> Tcoupl                   = nose-hoover
>> tc_grps                  = system
>> tau_t                    = 1.0
>> ref_t                    = some_temperature
>> ; Pressure coupling is off for NVT
>> Pcoupl                   = No
>> tau_p                    = 0.5
>> compressibility          = 4.5e-05
>> ref_p                    = 1.0
>> ; options for bonds
>> constraints              = all-bonds
>> constraint_algorithm     = lincs
>>
>>
>>
>>
>>
>>
>> On Fri, Apr 27, 2018 at 1:14 PM, Alex <nedomacho at gmail.com> wrote:
>>
>>> As I said, there are only two users, and nvidia-smi shows the process name.
>>> We're investigating, and it does appear that it is the EM step that uses
>>> cutoff electrostatics, so the user did not bother with -pme cpu in the
>>> mdrun call. What would be the correct way to enforce a CPU-only mdrun when
>>> coulombtype = cutoff?
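>>> For example, would forcing everything onto the CPU and disabling GPU
>>> detection in the submission script be enough? Something like the sketch
>>> below, assuming GMX_DISABLE_GPU_DETECTION still does what I think it does
>>> (this is not what the script currently does):
>>>
>>> # keep mdrun off the GPUs: no GPU detection, nonbonded and PME on the CPU
>>> export GMX_DISABLE_GPU_DETECTION=1
>>> gmx mdrun -nt 2 -nb cpu -pme cpu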
>>>
>>> Thanks,
>>>
>>> Alex
>>>
>>> On Fri, Apr 27, 2018 at 12:45 PM, Mark Abraham <mark.j.abraham at gmail.com>
>>> wrote:
>>>
>>>> No.
>>>>
>>>> Look at the processes that are running, e.g. with top or ps. Either old
>>>> simulations are still running, or another user is running something.
>>>>
>>>> Mark
>>>>
>>>> On Fri, Apr 27, 2018, 20:33 Alex <nedomacho at gmail.com> wrote:
>>>>
>>>>> Strange. There are only two people using this machine, myself being one
>>>>> of them, and the other person specifically forces -nb cpu -pme cpu in his
>>>>> calls to mdrun. Are any other GMX utilities (e.g. insert-molecules,
>>>>> grompp, or energy) trying to use GPUs?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Alex
>>>>>
>>>>> On Fri, Apr 27, 2018 at 5:33 AM, Szilárd Páll <pall.szilard at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> The second column is PIDs, so there is a whole lot more going on there
>>>>>> than just a single simulation with a single rank using two GPUs. That
>>>>>> would be one PID and two entries for the two GPUs. Are you sure you're
>>>>>> not running other processes?
>>>>>>
>>>>>> --
>>>>>> Szilárd
>>>>>>
>>>>>> On Thu, Apr 26, 2018 at 5:52 AM, Alex <nedomacho at gmail.com> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I am running GMX 2018 with gmx mdrun -pinoffset 0 -pin on -nt 24
>>>>>>> -ntmpi 4 -npme 1 -pme gpu -nb gpu -gputasks 1122
>>>>>>>
>>>>>>> Once in a while the simulation slows down and nvidia-smi reports
>>>>>>> something like this:
>>>>>>>
>>>>>>> |    1     12981      C   gmx                                 175MiB |
>>>>>>> |    2     12981      C   gmx                                 217MiB |
>>>>>>> |    2     13083      C   gmx                                 161MiB |
>>>>>>> |    2     13086      C   gmx                                 159MiB |
>>>>>>> |    2     13089      C   gmx                                 139MiB |
>>>>>>> |    2     13093      C   gmx                                 163MiB |
>>>>>>> |    2     13096      C   gmx                                  11MiB |
>>>>>>> |    2     13099      C   gmx                                   8MiB |
>>>>>>> |    2     13102      C   gmx                                   8MiB |
>>>>>>> |    2     13106      C   gmx                                   8MiB |
>>>>>>> |    2     13109      C   gmx                                   8MiB |
>>>>>>> |    2     13112      C   gmx                                   8MiB |
>>>>>>> |    2     13115      C   gmx                                   8MiB |
>>>>>>> |    2     13119      C   gmx                                   8MiB |
>>>>>>> |    2     13122      C   gmx                                   8MiB |
>>>>>>> |    2     13125      C   gmx                                   8MiB |
>>>>>>> |    2     13128      C   gmx                                   8MiB |
>>>>>>> |    2     13131      C   gmx                                   8MiB |
>>>>>>> |    2     13134      C   gmx                                   8MiB |
>>>>>>> |    2     13138      C   gmx                                   8MiB |
>>>>>>> |    2     13141      C   gmx                                   8MiB |
>>>>>>> +-------------------------------------------------------------------------+
>>>>>>>
>>>>>>> Then it goes back to the expected load. Is this normal?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Alex
>>>>>>>


