Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts
From: Dave Love (d.love_at_[hidden])
Date: 2010-10-14 07:23:53

Reuti <reuti_at_[hidden]> writes:

> With binding_instance set to "set" (the default) the
> shepherd should bind the processes to cores already. With other types
> of binding_instance the selected cores must be forwarded to the
> application via an environment variable or in the hostfile.

My question was specifically about SGE/OMPI tight integration; are you
actually doing binding successfully with that? I think I read here that
the integration doesn't (yet?) deal with SGE core binding, and when we
turned on the SGE feature we got the OMPI tasks piled onto a single
core. We quickly turned it off for MPI jobs when we realized what was
happening, and I didn't try to investigate further.
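
A minimal sketch of how one might check for that symptom, assuming Linux
and Python 3. It is not part of the original exchange. The helper names
are hypothetical, and the assumption that SGE exports the selected cores
as a whitespace-separated list in an SGE_BINDING environment variable
(for "-binding env") should be checked against your SGE version.

```python
import os


def parse_core_list(value):
    """Parse a whitespace-separated list of core ids, e.g. "0 1 2 3"."""
    return {int(tok) for tok in value.split()}


def piled_up(affinities):
    """Return True if several tasks are all bound to the same single core.

    `affinities` maps a task id (e.g. an MPI rank) to the set of cores
    that task is allowed to run on.
    """
    masks = [frozenset(m) for m in affinities.values()]
    return (
        len(masks) > 1
        and all(len(m) == 1 for m in masks)
        and len(set(masks)) == 1
    )


if __name__ == "__main__":
    # os.sched_getaffinity is only available on Linux.
    if hasattr(os, "sched_getaffinity"):
        allowed = sorted(os.sched_getaffinity(0))
    else:
        allowed = []
    # Assumption: SGE_BINDING holds the cores SGE selected for this job.
    requested = sorted(parse_core_list(os.environ.get("SGE_BINDING", "")))
    print("pid", os.getpid(), "allowed cores:", allowed,
          "SGE_BINDING:", requested)
```

Running the script under mpirun inside the job prints one line per task.
If every task reports the same single allowed core, the tasks have been
piled onto one core as described above.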

> As this is only a hint to SGE and not a hard request, the user must
> plan the allocation a little beforehand. In particular, it won't work
> if you oversubscribe a machine.

[It is documented that the binding isn't applied if the selected cores
are occupied.]