Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-02-09 16:19:42


On 09/02/2012 14:00, Ralph Castain wrote:
> There is another aspect, though - I had missed it in the thread, but the question Nadia was addressing is: how to tell I am bound? The way we currently do it is to compare our cpuset against the local cpuset - if we are on a subset, then we know we are bound.
>
> So if all hwloc returns to us is our cpuset, then we cannot make that determination. Yet I do see a utility as well in only showing our own cpus.

Each hwloc object has several "cpuset" fields describing whether CPUs
are online or not, and accessible or not. Here are their meanings when
the WHOLE_SYSTEM flag is NOT set:
* "cpuset" only contains CPUs that are online and accessible
* "online_cpuset" is "cpuset" + CPUs that are online but not accessible
* "allowed_cpuset" is "cpuset" + CPUs that are accessible but not online
* "complete_cpuset" is everything

So you can find out that you are "bound" by a Linux cgroup (I am not
saying Linux "cpuset" to avoid confusion) by comparing root->cpuset and
root->online_cpuset.
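
To make the comparison concrete, here is a minimal sketch in C. It is a
toy model and not the hwloc API: the struct, the function names and the
64-bit masks (bit i stands for CPU i) are all assumptions made for
illustration. It only encodes the four definitions listed above and the
root->cpuset versus root->online_cpuset test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT hwloc types: one 64-bit mask per set, bit i = CPU i. */
typedef struct {
    uint64_t cpuset;          /* online and accessible */
    uint64_t online_cpuset;   /* online, accessible or not */
    uint64_t allowed_cpuset;  /* accessible, online or not */
    uint64_t complete_cpuset; /* everything */
} model_root_sets;

/* Build the four sets for the case where WHOLE_SYSTEM is NOT set.
 * 'online' and 'accessible' must be subsets of 'all'. */
static model_root_sets model_make_sets(uint64_t all, uint64_t online,
                                       uint64_t accessible)
{
    model_root_sets s;
    assert((online & ~all) == 0);
    assert((accessible & ~all) == 0);
    s.complete_cpuset = all;
    s.online_cpuset = online;        /* cpuset + online-but-not-accessible */
    s.allowed_cpuset = accessible;   /* cpuset + accessible-but-not-online */
    s.cpuset = online & accessible;
    return s;
}

/* True when at least one online CPU is missing from cpuset, i.e. the
 * process is restricted to a subset of the online CPUs. */
static bool model_is_restricted(const model_root_sets *s)
{
    return s->cpuset != s->online_cpuset;
}
```

For example, with 8 CPUs all online (0xFF) and only the first 4
accessible (0x0F), model_is_restricted() returns true. With a real
hwloc 1.x topology the equivalent test would presumably be a
hwloc_bitmap_isequal() call on the two root fields, but that is not
shown or tested here.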

Brice

> Would it make sense to add a field to the hwloc_obj_t that contains the "accessible" cpus? Or a flag indicating "you are bound to a subset of all available cpus"?
>
> Really, all we need is the flag - but we could compute it ourselves if we had the larger scope info.