[gmx-users] strange GPU load distribution

Alex nedomacho at gmail.com
Fri Apr 27 21:36:51 CEST 2018


Update: we're removing commands one by one from the script that submits the
jobs causing the issue. The culprit is both EM and the MD run, and the GPUs
are affected _before_ MD starts loading the CPU, i.e. during the initial
setup of the EM run -- CPU load is near zero while nvidia-smi reports the
mess. I wonder if this is in any way related to the timing test we were
failing a while back.
The mdrun call and mdp are below, though I suspect they have nothing to do
with what is happening. Any help will be greatly appreciated.
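
For what it's worth, this is how we're matching the nvidia-smi entries to
actual commands (just a sketch; <PID> stands for whatever PID nvidia-smi
reports):

# list compute processes currently holding a context on any GPU
nvidia-smi --query-compute-apps=pid,process_name --format=csv
# show the owner and full command line of a reported PID
ps -o pid,user,args -p <PID>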

Alex

***

gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm
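
(Side note: as far as I understand, mdrun still performs GPU detection at
startup even with -nb cpu -pme cpu. If we want to be certain a run never
touches the GPUs, one option we're considering is hiding the devices from it
entirely, e.g.

CUDA_VISIBLE_DEVICES="" gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm em

or setting the GMX_DISABLE_GPU_DETECTION environment variable before the
run; "em" here is just a placeholder for the actual -deffnm argument.)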

mdp:

; Run control
integrator               = md-vv       ; Velocity Verlet
tinit                    = 0
dt                       = 0.002
nsteps                   = 500000    ; 1 ns
nstcomm                  = 100
; Output control
nstxout                  = 50000
nstvout                  = 50000
nstfout                  = 0
nstlog                   = 50000
nstenergy                = 50000
nstxout-compressed       = 0
; Neighborsearching and short-range nonbonded interactions
cutoff-scheme            = group
nstlist                  = 10
ns_type                  = grid
pbc                      = xyz
rlist                    = 1.4
; Electrostatics
coulombtype              = cutoff
rcoulomb                 = 1.4
; van der Waals
vdwtype                  = user
vdw-modifier             = none
rvdw                     = 1.4
; Apply long range dispersion corrections for Energy and Pressure
DispCorr                  = EnerPres
; Spacing for the PME/PPPM FFT grid
fourierspacing           = 0.12
; EWALD/PME/PPPM parameters
pme_order                = 6
ewald_rtol               = 1e-06
epsilon_surface          = 0
; Temperature coupling
Tcoupl                   = nose-hoover
tc_grps                  = system
tau_t                    = 1.0
ref_t                    = some_temperature
; Pressure coupling is off for NVT
Pcoupl                   = No
tau_p                    = 0.5
compressibility          = 4.5e-05
ref_p                    = 1.0
; options for bonds
constraints              = all-bonds
constraint_algorithm     = lincs
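
(For completeness, a run with this mdp would be prepared and launched roughly
as below; the file names are placeholders rather than the ones from our
script, and since vdwtype = user the tabulated potential also has to be
supplied, e.g. via -table:

gmx grompp -f nvt.mdp -c conf.gro -p topol.top -o nvt.tpr
gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm nvt -table table.xvg
)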






On Fri, Apr 27, 2018 at 1:14 PM, Alex <nedomacho at gmail.com> wrote:

> As I said, there are only two users, and nvidia-smi shows the process name.
> We're investigating, and it does appear to be the EM step, which uses cutoff
> electrostatics, so the user did not bother with -pme cpu in the mdrun call.
> What would be the correct way to enforce a CPU-only mdrun when
> coulombtype = cutoff?
>
> Thanks,
>
> Alex
>
> On Fri, Apr 27, 2018 at 12:45 PM, Mark Abraham <mark.j.abraham at gmail.com>
> wrote:
>
>> No.
>>
>> Look at the processes that are running, e.g. with top or ps. Either old
>> simulations or another user is running.
>>
>> Mark
>>
>> On Fri, Apr 27, 2018, 20:33 Alex <nedomacho at gmail.com> wrote:
>>
>> > Strange. There are only two people using this machine, myself being one of
>> > them, and the other person specifically forces -nb cpu -pme cpu in his
>> > calls to mdrun. Are any other GMX utilities (e.g. insert-molecules, grompp,
>> > or energy) trying to use GPUs?
>> >
>> > Thanks,
>> >
>> > Alex
>> >
>> > On Fri, Apr 27, 2018 at 5:33 AM, Szilárd Páll <pall.szilard at gmail.com>
>> > wrote:
>> >
>> > > The second column is PIDs, so there is a whole lot more going on there
>> > > than just a single simulation, single rank, using two GPUs. That would
>> > > be one PID and two entries for the two GPUs. Are you sure you're not
>> > > running other processes?
>> > >
>> > > --
>> > > Szilárd
>> > >
>> > > On Thu, Apr 26, 2018 at 5:52 AM, Alex <nedomacho at gmail.com> wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > I am running GMX 2018 with
>> > > > gmx mdrun -pinoffset 0 -pin on -nt 24 -ntmpi 4 -npme 1 -pme gpu -nb gpu -gputasks 1122
>> > > >
>> > > > Once in a while the simulation slows down and nvidia-smi reports
>> > > > something like this:
>> > > >
>> > > > |    1     12981      C   gmx                                  175MiB |
>> > > > |    2     12981      C   gmx                                  217MiB |
>> > > > |    2     13083      C   gmx                                  161MiB |
>> > > > |    2     13086      C   gmx                                  159MiB |
>> > > > |    2     13089      C   gmx                                  139MiB |
>> > > > |    2     13093      C   gmx                                  163MiB |
>> > > > |    2     13096      C   gmx                                   11MiB |
>> > > > |    2     13099      C   gmx                                    8MiB |
>> > > > |    2     13102      C   gmx                                    8MiB |
>> > > > |    2     13106      C   gmx                                    8MiB |
>> > > > |    2     13109      C   gmx                                    8MiB |
>> > > > |    2     13112      C   gmx                                    8MiB |
>> > > > |    2     13115      C   gmx                                    8MiB |
>> > > > |    2     13119      C   gmx                                    8MiB |
>> > > > |    2     13122      C   gmx                                    8MiB |
>> > > > |    2     13125      C   gmx                                    8MiB |
>> > > > |    2     13128      C   gmx                                    8MiB |
>> > > > |    2     13131      C   gmx                                    8MiB |
>> > > > |    2     13134      C   gmx                                    8MiB |
>> > > > |    2     13138      C   gmx                                    8MiB |
>> > > > |    2     13141      C   gmx                                    8MiB |
>> > > > +-----------------------------------------------------------------------------+
>> > > >
>> > > > Then it goes back to the expected load. Is this normal?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Alex
>> > > >