Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Mixing Linux's CPU-shielding with mpirun's bind-to-core
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-08-18 13:11:58


A process can always change its binding by "re-binding" to wherever it wants after MPI_Init completes.
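
For instance, something along these lines (a minimal Linux-only sketch; sched_setaffinity() is the glibc call, hwloc would be the portable alternative, and core 3 here is just an arbitrary pick):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);        /* MPI's own binding happens in here */

    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(3, &mask);             /* rebind the whole process to core 3 */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        perror("sched_setaffinity");

    /* ... compute, now bound to core 3 ... */

    MPI_Finalize();
    return 0;
}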

On Aug 18, 2013, at 9:38 AM, Siddhartha Jana <siddharthajana24_at_[hidden]> wrote:

> Firstly, I would like my program to dynamically assign itself to one of the cores it pleases and remain bound to it until it later reschedules itself.
>
> Ralph Castain wrote:
> >> "If you just want mpirun to respect an external cpuset limitation, it already does so when binding - it will bind within the external limitation"
>
> In my case, the limitation is enforced "internally", by the application once it begins execution. I enforce this during program execution, after mpirun has finished "binding within the external limitation".
>
>
> Brice Goglin said:
> >> "MPI can bind at two different times: inside mpirun after ssh before running the actual program (this one would ignore your cpuset), later at MPI_Init inside your program (this one will ignore your cpuset only if you call MPI_Init before creating the cpuset)."
>
> Noted. In that case, during program execution, whose binding is respected: mpirun's or MPI_Init()'s? Is my understanding of the above correct, i.e. that MPI_Init() performs a second round of binding processes to cores and can override what mpirun or the programmer enforced before it was called (using hwloc/cpuset/sched_load_balance() and other compatible cousins)?
>
>
> --------------------------------------------
> If this is so, in my case the flow of events is thus:
>
> 1. mpirun binds an MPI process which is yet to begin execution. So mpirun says: "Bind to some core A." (I don't use a hostfile/rankfile, but I do use the --bind-to-core flag.)
>
> 2. Process begins execution on core A
>
> 3. I enforce: "Bind to core B." (Remember, it is only at runtime that I know which core I want to be bound to, not while launching the processes with mpirun.) So my process shifts over to core B.
>
> 4. MPI_Init() once again honors the rankfile mapping (if any; the default policy otherwise) and rebinds my process to core A
>
> 5. The process finishes execution and calls MPI_Finalize(), staying on core A the whole time
>
> 6. mpirun exits
> --------------------------------------------
>
> So if I place step 3 above after step 4, my binding will hold for the rest of the execution. Please do let me know if my understanding is correct.
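>
> Concretely, I am thinking of something like this right after MPI_Init() returns (a sketch assuming Linux's sched_setaffinity()/sched_getaffinity(); "core B" = 5 is made up):
>
> #define _GNU_SOURCE
> #include <sched.h>
> #include <stdio.h>
>
> /* After MPI_Init(): move to the core chosen at runtime, then verify. */
> static void rebind_and_check(int core)
> {
>     cpu_set_t mask;
>     CPU_ZERO(&mask);
>     CPU_SET(core, &mask);                    /* e.g. core = 5 */
>     if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
>         perror("sched_setaffinity");
>
>     CPU_ZERO(&mask);
>     if (sched_getaffinity(0, sizeof(mask), &mask) == 0)
>         for (int c = 0; c < CPU_SETSIZE; c++)
>             if (CPU_ISSET(c, &mask))
>                 printf("now bound to core %d\n", c);
> }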
>
> Thanks for all the help
>
> Sincerely,
> Siddhartha Jana
> HPCTools
>
> On 18 August 2013 10:49, Ralph Castain <rhc_at_[hidden]> wrote:
> If you require that a specific rank go to a specific core, then use the rankfile mapper; the syntax is explained in "man mpirun".
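>
> For instance, a rankfile along these lines (a sketch following the syntax in "man mpirun"; the hostnames and slot numbers are placeholders) pins each rank to a socket:core pair:
>
> rank 0=node0 slot=0:0
> rank 1=node0 slot=0:1
> rank 2=node1 slot=0:0
> rank 3=node1 slot=0:1
>
> and is passed to mpirun via "mpirun -rf <rankfile> ..." (or --rankfile).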
>
> If you just want mpirun to respect an external cpuset limitation, it already does so when binding - it will bind within the external limitation
>
>
> On Aug 18, 2013, at 6:09 AM, Siddhartha Jana <siddharthajana24_at_[hidden]> wrote:
>
>> So my question really boils down to:
>> How does one ensure that mpirun launches the processes on the specific cores they are expected to be bound to?
>> As I mentioned, if there were a way to specify the cores through the hostfile, this problem would be solved.
>>
>> Thanks for all the quick replies,
>> -- Sid
>>
>> On 18 August 2013 09:04, Siddhartha Jana <siddharthajana24_at_[hidden]> wrote:
>> Thanks, John. But I have an incredibly small system: 2 nodes with 16 cores each.
>> 2-4 MPI processes. :-)
>>
>> On 18 August 2013 09:03, John Hearns <hearnsj_at_[hidden]> wrote:
>> You really should install a job scheduler.
>> There are free versions.
>>
>> I'm not sure about cpuset support in Gridengine. Anyone?