John Hearns <hearnsj_at_[hidden]> writes:
> Agree with what you say Dave.
> Regarding not wanting jobs to use certain cores, i.e. reserving low-numbered
> cores for OS processes then surely a good way forward is to use a 'boot
> cpuset' of one or two cores and let your jobs run on the rest of the cores.
Maybe, if you make sure the resource manager knows about it, and users
don't mind losing the cores, presumably resulting in a cock-eyed MPI
process distribution. Is it really necessary, compared with simply
using core binding?
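
To illustrate the lopsided distribution mentioned above, here is a minimal
sketch. It assumes a hypothetical node with two sockets of eight cores each,
with cores numbered socket by socket; real numbering varies by machine, and
the function name is made up for the example.

```python
from collections import Counter

def free_cores_per_socket(sockets, cores_per_socket, reserved):
    """Count cores left for jobs on each socket after reserving some.

    Assumes cores are numbered socket by socket (an assumption for this
    example only; check the real topology with a tool such as hwloc).
    """
    total = sockets * cores_per_socket
    free = [c for c in range(total) if c not in reserved]
    counts = Counter(c // cores_per_socket for c in free)
    return [counts.get(s, 0) for s in range(sockets)]

# Reserving cores 0 and 1 for a boot cpuset leaves the sockets unbalanced.
print(free_cores_per_socket(2, 8, {0, 1}))  # [6, 8]
```

With six cores on one socket and eight on the other, an MPI job that fills
the node cannot place the same number of ranks on each socket.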
I'd expect the bulk of overheads to be due to the resource manager,
especially if it tracks things by grovelling /proc frequently, not to
the OS. In cases I've measured, the overhead is typically ~1%, depending
on parameters, and it scales more slowly than core count.
> You're right about cpusets being helpful with 'badly behaved' jobs.
> War stories some other time!
Well [trying to bring this on topic], things got much more sanitary here
after I replaced the wretched Streamline-supplied setup with tight
integration of OMPI under SGE and then made the SGE core binding
inherited by OMPI work sensibly with partially full nodes.
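
As a rough illustration of what binding within an inherited core set means on
a partially full node, here is a sketch. It is not how SGE or OMPI implement
it; the function name and the granted core numbers are hypothetical, and the
last step relies on Linux-only affinity calls.

```python
import os

def assign_cores(allowed, nranks):
    """Map local ranks 0..nranks-1 onto the allowed cores, one core each."""
    cores = sorted(allowed)
    if nranks > len(cores):
        raise ValueError("more ranks than granted cores")
    return {rank: cores[rank] for rank in range(nranks)}

# Cores the resource manager granted to this job on a shared node
# (hypothetical values; another job holds cores 0-3).
granted = {4, 5, 6, 7}
print(assign_cores(granted, 4))  # {0: 4, 1: 5, 2: 6, 3: 7}

# On Linux the mask inherited from the parent can be read rather than
# hard-coded; here a single rank is placed within whatever was inherited.
if hasattr(os, "sched_getaffinity"):
    inherited = os.sched_getaffinity(0)
    print(assign_cores(inherited, 1))
```

The point is that ranks are numbered relative to the cores the job was
given, not from core 0 of the node, so two jobs sharing a node do not land
on the same cores.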