Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] cleanup of rr_byobj
From: tmishima_at_[hidden]
Date: 2014-03-27 20:21:11


I added two improvements. Please replace the previous patch file
with the attached one, and take a look this weekend.

1. Add a pre-check for ORTE_ERR_NOT_FOUND so that the retry with byslot
works correctly afterward. Otherwise the retry could fail, because
some fields such as node->procs and node->slots_inuse have already
been updated.

2. Improve the detection of oversubscription when node->slots is not
a multiple of cpus_per_rank. For example, using node05 and node06
with slots = 8 each and setting cpus_per_rank = 3, np = 5 should be
oversubscribed, although np x cpus_per_rank (5 x 3 = 15) is less than
num_slots (= 16). I fixed the code to detect this oversubscription.
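
To illustrate item 2, here is a minimal sketch in Python. It is not the
ORTE code from the patch; the function names are hypothetical and it
assumes each rank must get all of its cpus_per_rank slots on a single
node. Under that assumption a node with 8 slots holds only two ranks
of 3 cpus, so two such nodes hold 4 ranks and np = 5 is oversubscribed,
even though a check against the total slot count says otherwise.

```python
def max_ranks(slots_per_node, cpus_per_rank):
    """Ranks that fit when a rank cannot be split across nodes."""
    # Leftover slots on a node (8 % 3 = 2 here) cannot host another rank.
    return sum(slots // cpus_per_rank for slots in slots_per_node)


def is_oversubscribed(np, slots_per_node, cpus_per_rank):
    """Per-node check: compare np with the ranks that actually fit."""
    return np > max_ranks(slots_per_node, cpus_per_rank)


def naive_is_oversubscribed(np, slots_per_node, cpus_per_rank):
    """Total-slot check: misses the case where slots is not a multiple."""
    return np * cpus_per_rank > sum(slots_per_node)


if __name__ == "__main__":
    slots = [8, 8]  # node05 and node06 from the example above
    print(naive_is_oversubscribed(5, slots, 3))
    print(is_oversubscribed(5, slots, 3))
```

With the values from the example, the total-slot check reports no
oversubscription (15 <= 16), while the per-node check reports it
(5 > 4).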

Tetsuya

(See attached file: patch.byobj2)

> Hi Tetsuya
>
> Let me take a look when I get home this weekend - I'm giving an ORTE
> tutorial to a group of new developers this week and my time is very
> limited.
>
> Thanks
> Ralph
>
>
>
> On Tue, Mar 25, 2014 at 5:37 PM, <tmishima_at_[hidden]> wrote:
>
> Hi Ralph, I moved on to the development list.
>
> I'm not sure why the add_one flag is used in rr_byobj.
> Here, if oversubscribed, procs are mapped to each object
> one by one. So, I think add_one is not necessary.
>
> Instead, when the user doesn't permit oversubscription,
> the second pass should be skipped.
>
> I made the logic a bit clearer based on this idea and
> removed some outputs to synchronize it with the 1.7 branch.
>
> Please take a look at the attached patch file.
>
> Tetsuya
>
> (See attached file: patch.byobj)
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/03/14393.php