Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] Mixing Linux's CPU-shielding with mpirun's bind-to-core
From: Siddhartha Jana (siddharthajana24_at_[hidden])
Date: 2013-08-18 12:38:56

Firstly, I would like my program to dynamically assign itself to whichever
core it pleases and remain bound to it until it later reschedules itself.

*Ralph Castain wrote*:
*>> "If you just want mpirun to respect an external cpuset limitation, it
already does so when binding - it will bind within the external limitation"*

In my case, the limitation is enforced "internally", by the application
once it begins execution. I enforce this during program execution, after
mpirun has finished "binding within the external limitation".
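
The kind of internal, run-time binding described here can be sketched as
follows. This is only a minimal sketch, assuming Linux with glibc and its
sched_setaffinity() call; pin_self_to_core() and first_allowed_core() are
hypothetical helper names made up for illustration, not part of Open MPI or
hwloc. Note that sched_setaffinity() with pid 0 affects the calling thread.

```c
#define _GNU_SOURCE   /* exposes sched_setaffinity() and the CPU_* macros in glibc */
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to a single core.
 * Returns 0 on success, -1 if the core number is out of range or the
 * kernel rejects the request (e.g. the core lies outside the cpuset). */
int pin_self_to_core(int core)
{
    cpu_set_t mask;

    if (core < 0 || core >= CPU_SETSIZE)
        return -1;

    CPU_ZERO(&mask);
    CPU_SET(core, &mask);

    /* pid 0 means "the calling thread" */
    return sched_setaffinity(0, sizeof(mask), &mask);
}

/* Hypothetical helper: lowest-numbered core the calling thread is
 * currently allowed to run on, or -1 if the mask cannot be read. */
int first_allowed_core(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;

    for (int c = 0; c < CPU_SETSIZE; c++)
        if (CPU_ISSET(c, &mask))
            return c;

    return -1;
}
```

The application would call pin_self_to_core() with whatever core number it
decides on at run time. A request for a core outside the mask imposed by an
enclosing cpuset is expected to fail rather than silently escape it.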

*Brice Goglin said*:
*>> "MPI can bind at two different times: inside mpirun after ssh before
running the actual program (this one would ignore your cpuset), later at
MPI_Init inside your program (this one will ignore your cpuset only if you
call MPI_Init before creating the cpuset)."*

Noted. In that case, during program execution, whose binding is respected -
mpirun's or MPI_Init()'s? Is my understanding of the above correct: that
MPI_Init() is responsible for a second round of binding processes to cores,
and can override whatever mpirun or the programmer enforced before the call
(using hwloc, cpusets, sched_load_balance and other *compatible* cousins)?

If this is so, in my case the flow of events is thus:

1. mpirun binds an MPI process which is yet to begin execution. So mpirun
says: "Bind to some core - A". (I don't use any hostfile/rankfile, but I do
use the --bind-to-core flag.)

2. Process begins execution on core A

3. I enforce: "Bind to core B". (Remember, I only know which core I want to
be bound to at runtime, not when launching the processes with mpirun.) So
my process shifts over to core B

4. MPI_Init() once again honors the rankfile mapping (if any; otherwise the
default policy) and rebinds my process to core A

5. The process finishes execution and calls MPI_Finalize(), having stayed
on core A the whole time

6. mpirun exits

So if I move step 3 to after step 4, my request will hold for the rest
of the execution. Please let me know if my understanding is correct.
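
One way to check this ordering empirically, rather than rely on my reading
of it, would be to print the binding at each step. The following is a
minimal sketch, assuming Linux with glibc; report_binding() and
allowed_core_count() are hypothetical helper names, and nothing here is
Open MPI API.

```c
#define _GNU_SOURCE   /* exposes sched_getaffinity(), sched_getcpu(), CPU_* macros */
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: number of cores the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
int allowed_core_count(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;

    return CPU_COUNT(&mask);
}

/* Hypothetical helper: print the core in use right now and the full list
 * of allowed cores, tagged with a caller-supplied label.
 * Returns 0 on success, -1 if the affinity mask cannot be read. */
int report_binding(const char *label)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;

    printf("[%s] running on core %d, allowed cores:", label, sched_getcpu());
    for (int c = 0; c < CPU_SETSIZE; c++)
        if (CPU_ISSET(c, &mask))
            printf(" %d", c);
    printf("\n");

    return 0;
}
```

Calling report_binding() at program start, after my own rebinding, right
after MPI_Init() and just before MPI_Finalize() should show whether the
mask set in step 3 survives step 4.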

Thanks for all the help

Siddhartha Jana

On 18 August 2013 10:49, Ralph Castain <rhc_at_[hidden]> wrote:

> If you require that a specific rank go to a specific core, then use the
> rankfile mapper - you can see explanations on the syntax in "man mpirun".
>
> If you just want mpirun to respect an external cpuset limitation, it
> already does so when binding - it will bind within the external limitation.
> On Aug 18, 2013, at 6:09 AM, Siddhartha Jana <siddharthajana24_at_[hidden]>
> wrote:
> So my question really boils down to:
> How does one ensure that mpirun launches the processes on the "specific"
> cores that are expected of them to be bound to.
> As I mentioned, if there were a way to specify the cores through the
> hostfile, this problem should be solved.
> Thanks for all the quick replies,
> -- Sid
> On 18 August 2013 09:04, Siddhartha Jana <siddharthajana24_at_[hidden]> wrote:
>> Thanks John. But I have an incredibly small system. 2 nodes - 16 cores
>> each.
>> 2-4 MPI processes. :-)
>> On 18 August 2013 09:03, John Hearns <hearnsj_at_[hidden]> wrote:
>>> You really should install a job scheduler.
>>> There are free versions.
>>> I'm not sure about cpuset support in Gridengine. Anyone?
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]