[gmx-users] strange GPU load distribution
Mark Abraham
mark.j.abraham at gmail.com
Fri Apr 27 21:52:20 CEST 2018
The group cutoff scheme can never run on a GPU, so none of that should matter.
Use ps to find out what the command lines were.
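For example, something along these lines (a rough sketch; exact nvidia-smi
query fields may differ between driver versions):

# PIDs of the compute processes that nvidia-smi sees on the GPUs
nvidia-smi --query-compute-apps=pid,process_name --format=csv

# full command line for each of those PIDs
ps -o pid,args -p $(nvidia-smi --query-compute-apps=pid --format=csv,noheader | paste -s -d, -)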
Mark
On Fri, Apr 27, 2018, 21:37 Alex <nedomacho at gmail.com> wrote:
> Update: we're removing commands one by one from the script that submits
> the jobs causing the issue. The culprits are both EM and the MD run, and
> the GPUs are affected _before_ MD starts loading the CPU, i.e. during the
> initial setup of the EM run -- CPU load is near zero, yet nvidia-smi
> reports the mess. I wonder if this is in any way related to that timing
> test we were failing a while back.
> The mdrun call and mdp are below, though I suspect they have nothing to do
> with what is happening. Any help would be very much appreciated.
>
> Alex
>
> ***
>
> gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm
>
> mdp:
>
> ; Run control
> integrator = md-vv ; Velocity Verlet
> tinit = 0
> dt = 0.002
> nsteps = 500000 ; 1 ns
> nstcomm = 100
> ; Output control
> nstxout = 50000
> nstvout = 50000
> nstfout = 0
> nstlog = 50000
> nstenergy = 50000
> nstxout-compressed = 0
> ; Neighborsearching and short-range nonbonded interactions
> cutoff-scheme = group
> nstlist = 10
> ns_type = grid
> pbc = xyz
> rlist = 1.4
> ; Electrostatics
> coulombtype = cutoff
> rcoulomb = 1.4
> ; van der Waals
> vdwtype = user
> vdw-modifier = none
> rvdw = 1.4
> ; Apply long range dispersion corrections for Energy and Pressure
> DispCorr = EnerPres
> ; Spacing for the PME/PPPM FFT grid
> fourierspacing = 0.12
> ; EWALD/PME/PPPM parameters
> pme_order = 6
> ewald_rtol = 1e-06
> epsilon_surface = 0
> ; Temperature coupling
> Tcoupl = nose-hoover
> tc_grps = system
> tau_t = 1.0
> ref_t = some_temperature
> ; Pressure coupling is off for NVT
> Pcoupl = No
> tau_p = 0.5
> compressibility = 4.5e-05
> ref_p = 1.0
> ; options for bonds
> constraints = all-bonds
> constraint_algorithm = lincs
>
> On Fri, Apr 27, 2018 at 1:14 PM, Alex <nedomacho at gmail.com> wrote:
>
> > As I said, only two users, and nvidia-smi shows the process name. We're
> > investigating, and it does appear that it is the EM step that uses cutoff
> > electrostatics, and as a result the user did not bother with -pme cpu in
> > the mdrun call. What would be the correct way to enforce a CPU-only mdrun
> > when coulombtype = cutoff?
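> >
> > E.g., would explicitly keeping both the nonbondeds and PME on the CPU be
> > enough here:
> >
> > gmx mdrun -nb cpu -pme cpu ...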
> >
> > Thanks,
> >
> > Alex
> >
> > On Fri, Apr 27, 2018 at 12:45 PM, Mark Abraham <mark.j.abraham at gmail.com>
> > wrote:
> >
> >> No.
> >>
> >> Look at the processes that are running, e.g. with top or ps. Either old
> >> simulations are still running, or another user is running something.
> >>
> >> Mark
> >>
> >> On Fri, Apr 27, 2018, 20:33 Alex <nedomacho at gmail.com> wrote:
> >>
> >> > Strange. There are only two people using this machine, myself being one
> >> > of them, and the other person specifically forces -nb cpu -pme cpu in
> >> > his calls to mdrun. Are any other GMX utilities (e.g. insert-molecules,
> >> > grompp, or energy) trying to use GPUs?
> >> >
> >> > Thanks,
> >> >
> >> > Alex
> >> >
> >> > On Fri, Apr 27, 2018 at 5:33 AM, Szilárd Páll <pall.szilard at gmail.com>
> >> > wrote:
> >> >
> >> > > The second column shows PIDs, so there is a whole lot more going on
> >> > > there than just a single simulation with a single rank using two GPUs.
> >> > > That would be one PID with two entries, one for each GPU. Are you sure
> >> > > you're not running other processes?
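> >> > >
> >> > > For instance, pgrep would show how many separate gmx processes are
> >> > > actually alive on the node (a quick sketch, assuming a pgrep that
> >> > > supports -a for printing full command lines):
> >> > >
> >> > > pgrep -a gmx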
> >> > >
> >> > > --
> >> > > Szilárd
> >> > >
> >> > > On Thu, Apr 26, 2018 at 5:52 AM, Alex <nedomacho at gmail.com> wrote:
> >> > >
> >> > > > Hi all,
> >> > > >
> >> > > > I am running GMX 2018 with gmx mdrun -pinoffset 0 -pin on -nt 24
> >> > > > -ntmpi 4 -npme 1 -pme gpu -nb gpu -gputasks 1122
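> >> > > > i.e., as I understand the options, that is one process with four
> >> > > > thread-MPI ranks (3 PP + 1 PME), with its four GPU tasks mapped to
> >> > > > GPU ids 1,1,2,2.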
> >> > > >
> >> > > > Once in a while the simulation slows down and nvidia-smi reports
> >> > > > something like this:
> >> > > >
> >> > > > |    1     12981      C   gmx                              175MiB |
> >> > > > |    2     12981      C   gmx                              217MiB |
> >> > > > |    2     13083      C   gmx                              161MiB |
> >> > > > |    2     13086      C   gmx                              159MiB |
> >> > > > |    2     13089      C   gmx                              139MiB |
> >> > > > |    2     13093      C   gmx                              163MiB |
> >> > > > |    2     13096      C   gmx                               11MiB |
> >> > > > |    2     13099      C   gmx                                8MiB |
> >> > > > |    2     13102      C   gmx                                8MiB |
> >> > > > |    2     13106      C   gmx                                8MiB |
> >> > > > |    2     13109      C   gmx                                8MiB |
> >> > > > |    2     13112      C   gmx                                8MiB |
> >> > > > |    2     13115      C   gmx                                8MiB |
> >> > > > |    2     13119      C   gmx                                8MiB |
> >> > > > |    2     13122      C   gmx                                8MiB |
> >> > > > |    2     13125      C   gmx                                8MiB |
> >> > > > |    2     13128      C   gmx                                8MiB |
> >> > > > |    2     13131      C   gmx                                8MiB |
> >> > > > |    2     13134      C   gmx                                8MiB |
> >> > > > |    2     13138      C   gmx                                8MiB |
> >> > > > |    2     13141      C   gmx                                8MiB |
> >> > > > +-----------------------------------------------------------------+
> >> > > >
> >> > > > Then it goes back to the expected load. Is this normal?
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > Alex
> >> > > >