[gmx-users] strange GPU load distribution
Alex
nedomacho at gmail.com
Fri Apr 27 22:18:06 CEST 2018
I see. :) I will check again when I am back at work.
Thanks!
Alex
On 4/27/2018 2:16 PM, Mark Abraham wrote:
> Hi,
>
> What you think was run isn't nearly as useful when troubleshooting as
> asking the kernel what is actually running.
>
> Mark
>
>
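A minimal version of that check, for reference (the PID is illustrative,
taken from the nvidia-smi listing quoted later in this thread):

    nvidia-smi                      # note the PIDs in the compute-process table
    ps -o pid,ppid,args -p 12981    # print the full command line behind one of those PIDs

ps only reports processes alive at that moment, so run it while the GPU
load looks wrong.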
> On Fri, Apr 27, 2018, 21:59 Alex <nedomacho at gmail.com> wrote:
>
>> Mark, I copied the exact command line from the script, right above the
>> mdp file. It is literally how the script calls mdrun in this case:
>>
>> gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm
>>
>>
>> On 4/27/2018 1:52 PM, Mark Abraham wrote:
>>> Group cutoff scheme can never run on a GPU, so none of that should
>>> matter.
>>> Use ps and find out what the command lines were.
>>>
>>> Mark
>>>
>>> On Fri, Apr 27, 2018, 21:37 Alex <nedomacho at gmail.com> wrote:
>>>
>>>> Update: we're removing commands one by one from the script that
>>>> submits the jobs causing the issue. The culprits are both EM and the MD
>>>> run: the GPUs are affected _before_ MD starts loading the CPU, i.e.
>>>> during the initial setup of the EM run -- CPU load is near zero, yet
>>>> nvidia-smi reports the mess. I wonder if this is in any way related to
>>>> that timing test we were failing a while back.
>>>> The mdrun call and mdp are below, though I suspect they have nothing to
>>>> do with what is happening. Any help will be highly appreciated.
>>>>
>>>> Alex
>>>>
>>>> ***
>>>>
>>>> gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm
>>>>
>>>> mdp:
>>>>
>>>> ; Run control
>>>> integrator = md-vv ; Velocity Verlet
>>>> tinit = 0
>>>> dt = 0.002
>>>> nsteps = 500000 ; 1 ns
>>>> nstcomm = 100
>>>> ; Output control
>>>> nstxout = 50000
>>>> nstvout = 50000
>>>> nstfout = 0
>>>> nstlog = 50000
>>>> nstenergy = 50000
>>>> nstxout-compressed = 0
>>>> ; Neighborsearching and short-range nonbonded interactions
>>>> cutoff-scheme = group
>>>> nstlist = 10
>>>> ns_type = grid
>>>> pbc = xyz
>>>> rlist = 1.4
>>>> ; Electrostatics
>>>> coulombtype = cutoff
>>>> rcoulomb = 1.4
>>>> ; van der Waals
>>>> vdwtype = user
>>>> vdw-modifier = none
>>>> rvdw = 1.4
>>>> ; Apply long range dispersion corrections for Energy and Pressure
>>>> DispCorr = EnerPres
>>>> ; Spacing for the PME/PPPM FFT grid
>>>> fourierspacing = 0.12
>>>> ; EWALD/PME/PPPM parameters
>>>> pme_order = 6
>>>> ewald_rtol = 1e-06
>>>> epsilon_surface = 0
>>>> ; Temperature coupling
>>>> Tcoupl = nose-hoover
>>>> tc_grps = system
>>>> tau_t = 1.0
>>>> ref_t = some_temperature
>>>> ; Pressure coupling is off for NVT
>>>> Pcoupl = No
>>>> tau_p = 0.5
>>>> compressibility = 4.5e-05
>>>> ref_p = 1.0
>>>> ; options for bonds
>>>> constraints = all-bonds
>>>> constraint_algorithm = lincs
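For completeness, the mdrun log prints a hardware-detection and
task-assignment report near the top of the file, which is a more direct
record than nvidia-smi of what a given run was allowed to use. A quick
check, assuming the -deffnm name was em (illustrative):

    grep -i "gpu" em.log    # the detection/assignment section should state whether any task ran on a GPU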
>>>>
>>>> On Fri, Apr 27, 2018 at 1:14 PM, Alex <nedomacho at gmail.com> wrote:
>>>>
>>>>> As I said, there are only two users, and nvidia-smi shows the process
>>>>> name. We're investigating, and it does appear that the EM uses cutoff
>>>>> electrostatics, and as a result the user did not bother with -pme cpu
>>>>> in the mdrun call. What would be the correct way to enforce a CPU-only
>>>>> mdrun when coulombtype = cutoff?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Alex
>>>>>
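One hedged answer, for the archive: with GROMACS 2018, -nb cpu -pme cpu
already forces the short-range and PME tasks onto the CPU, and setting the
GMX_DISABLE_GPU_DETECTION environment variable (listed in the user guide's
environment variables) should additionally stop mdrun from probing the
GPUs at all. A sketch, with the em name purely illustrative:

    GMX_DISABLE_GPU_DETECTION=1 gmx mdrun -nt 2 -nb cpu -pme cpu -deffnm em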
>>>>> On Fri, Apr 27, 2018 at 12:45 PM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
>>>>>
>>>>>> No.
>>>>>>
>>>>>> Look at the processes that are running, e.g. with top or ps. Either old
>>>>>> simulations or another user is running.
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>> On Fri, Apr 27, 2018, 20:33 Alex <nedomacho at gmail.com> wrote:
>>>>>>
>>>>>>> Strange. There are only two people using this machine, myself being
>>>>>>> one of them, and the other person specifically forces -nb cpu -pme cpu
>>>>>>> in his calls to mdrun. Are any other GMX utilities (e.g.
>>>>>>> insert-molecules, grompp, or energy) trying to use GPUs?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>> On Fri, Apr 27, 2018 at 5:33 AM, Szilárd Páll <pall.szilard at gmail.com> wrote:
>>>>>>>
>>>>>>>> The second column is PIDs, so there is a whole lot more going on
>>>>>>>> there than just a single simulation with a single rank using two
>>>>>>>> GPUs. That would be one PID and two entries for the two GPUs. Are
>>>>>>>> you sure you're not running other processes?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Szilárd
>>>>>>>>
>>>>>>>> On Thu, Apr 26, 2018 at 5:52 AM, Alex <nedomacho at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I am running GMX 2018 with gmx mdrun -pinoffset 0 -pin on -nt 24
>>>>>>>>> -ntmpi 4 -npme 1 -pme gpu -nb gpu -gputasks 1122
>>>>>>>>>
>>>>>>>>> Once in a while the simulation slows down and nvidia-smi reports
>>>>>>>>> something like this:
>>>>>>>>>
>>>>>>>>> |    1     12981      C   gmx                              175MiB |
>>>>>>>>> |    2     12981      C   gmx                              217MiB |
>>>>>>>>> |    2     13083      C   gmx                              161MiB |
>>>>>>>>> |    2     13086      C   gmx                              159MiB |
>>>>>>>>> |    2     13089      C   gmx                              139MiB |
>>>>>>>>> |    2     13093      C   gmx                              163MiB |
>>>>>>>>> |    2     13096      C   gmx                               11MiB |
>>>>>>>>> |    2     13099      C   gmx                                8MiB |
>>>>>>>>> |    2     13102      C   gmx                                8MiB |
>>>>>>>>> |    2     13106      C   gmx                                8MiB |
>>>>>>>>> |    2     13109      C   gmx                                8MiB |
>>>>>>>>> |    2     13112      C   gmx                                8MiB |
>>>>>>>>> |    2     13115      C   gmx                                8MiB |
>>>>>>>>> |    2     13119      C   gmx                                8MiB |
>>>>>>>>> |    2     13122      C   gmx                                8MiB |
>>>>>>>>> |    2     13125      C   gmx                                8MiB |
>>>>>>>>> |    2     13128      C   gmx                                8MiB |
>>>>>>>>> |    2     13131      C   gmx                                8MiB |
>>>>>>>>> |    2     13134      C   gmx                                8MiB |
>>>>>>>>> |    2     13138      C   gmx                                8MiB |
>>>>>>>>> |    2     13141      C   gmx                                8MiB |
>>>>>>>>> +-------------------------------------------------------------------------+
>>>>>>>>>
>>>>>>>>> Then goes back to the expected load. Is this normal?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Alex
>>>>>>>>>
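For readers of the archive: with -ntmpi 4 -npme 1, the command above
starts three PP ranks plus one PME rank inside a single gmx process, and
-gputasks 1122 maps those four GPU tasks to devices 1, 1, 2, 2. A healthy
run should therefore show a single gmx PID on GPUs 1 and 2 only, which is
why the many extra PIDs in the table point to other processes rather than
this simulation. A sketch of a narrower query (field names as accepted by
nvidia-smi's --query-compute-apps, to the best of my knowledge):

    nvidia-smi --query-compute-apps=gpu_uuid,pid,process_name --format=csv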