Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] carto vs. hwloc
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2009-12-15 09:41:34


Kenneth Lloyd wrote:
> My 2 cents: Carto is a weighted graph structure that describes the topology
> of the compute cluster, not just locations of nodes. Many consider topologies
> (trees, meshes, tori) to be static - but I've found this an unnecessary and
> undesirable constraint.
>
> The compute fabric may better be left open to dynamic configuration,
> dependent upon the partitioning of jobs, tasks and data to be run.
>
> How do others see this?
>
>
At the network level, and even at the level of a node's resources, I
think a case can be made for a dynamically changing topology as you
mention above. However, is MPI the right level to compensate for
interfaces coming and going?

It would be nice/cool if there were an APM-like feature, available at
the network API level, that spanned HCAs and not just ports on the same
HCA. I know why this is currently done the way it is for IB, but it
always struck me that you'd want to handle interface/path changes below
MPI. That way more than just MPI codes could reap the benefits.

At the node level, the whole issue of a process's locality in relation
to its memory or to other processes seems to cry out to be more of an
OS type of job than an MPI one. The reasons are, first, that you could
end up with quite a complex layout for a job and, second, that things
really become complicated if you want to take other MPI jobs into
account.
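
As a small illustration of the OS already owning the basic binding
primitive, here is a minimal Python sketch (not Open MPI or hwloc code;
the helper names allowed_cpus and pin_to are made up for this example).
It assumes a Linux-style os.sched_getaffinity / os.sched_setaffinity
and falls back to "all CPUs" elsewhere:

```python
import os

def allowed_cpus():
    """Return the set of CPU ids this process may run on, as reported by the OS."""
    if hasattr(os, "sched_getaffinity"):
        return set(os.sched_getaffinity(0))  # 0 means "the calling process"
    # Assumption for platforms without sched_getaffinity: every CPU is allowed.
    return set(range(os.cpu_count() or 1))

def pin_to(cpus):
    """Ask the OS to restrict this process to `cpus`; return what is now allowed."""
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, cpus)
        except OSError:
            pass  # the OS refused; keep whatever binding we already had
    return allowed_cpus()

before = allowed_cpus()
first = min(before)
after = pin_to({first})      # bind to a single CPU we were already allowed on
print("allowed before:", sorted(before))
print("allowed after pinning:", sorted(after))
pin_to(before)               # restore the original binding
```

Note that this only covers one process at a time; nothing here knows
about the other processes in the job, let alone other MPI jobs, which
is exactly the part that gets complicated.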

The above being said, I don't hold too much hope that things below MPI
will actually take on these tasks, even though it seems like a logical
level for these things to occur IMO.

Anyway, I think keeping dynamic changes in mind is well worth it, but
starting from a static position and moving toward a dynamic one makes a
lot of sense.
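
To make the static-then-dynamic idea concrete, here is a minimal Python
sketch of a weighted topology graph whose links can be changed after it
is built. It is only an illustration under my own assumptions: the
Topology class, the node names (socket0, hca0, ...) and the weights are
hypothetical and are not carto's or hwloc's actual data structures.

```python
import heapq

class Topology:
    """Undirected weighted graph; all names and weights are hypothetical."""

    def __init__(self):
        self.adj = {}  # node -> {neighbor: weight}

    def set_link(self, a, b, weight):
        """Add a link, or update its weight if it already exists."""
        self.adj.setdefault(a, {})[b] = weight
        self.adj.setdefault(b, {})[a] = weight

    def drop_link(self, a, b):
        """Remove a link, e.g. when an interface goes away."""
        self.adj.get(a, {}).pop(b, None)
        self.adj.get(b, {}).pop(a, None)

    def distance(self, src, dst):
        """Cheapest path cost from src to dst (Dijkstra); None if unreachable."""
        best = {src: 0}
        heap = [(0, src)]
        while heap:
            d, node = heapq.heappop(heap)
            if node == dst:
                return d
            if d > best.get(node, float("inf")):
                continue  # stale heap entry
            for nxt, w in self.adj.get(node, {}).items():
                nd = d + w
                if nd < best.get(nxt, float("inf")):
                    best[nxt] = nd
                    heapq.heappush(heap, (nd, nxt))
        return None

# Static starting point: two sockets, each with a local HCA.
topo = Topology()
topo.set_link("socket0", "hca0", 1)
topo.set_link("socket1", "hca1", 1)
topo.set_link("socket0", "socket1", 2)
static_cost = topo.distance("socket0", "hca0")

# Dynamic change: hca0 disappears, so socket0 must go through socket1.
topo.drop_link("socket0", "hca0")
lost_cost = topo.distance("socket0", "hca0")
dynamic_cost = topo.distance("socket0", "hca1")
print(static_cost, lost_cost, dynamic_cost)
```

The static case is just the graph as first built; supporting dynamic
changes later mostly means allowing set_link/drop_link after startup
and re-running the query, which is why starting static does not seem to
paint us into a corner.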

--td
> Ken Lloyd
>
>
>> -----Original Message-----
>> From: devel-bounces_at_[hidden]
>> [mailto:devel-bounces_at_[hidden]] On Behalf Of Jeff Squyres
>> Sent: Monday, December 14, 2009 6:47 PM
>> To: Open MPI Developers List
>> Subject: Re: [OMPI devel] carto vs. hwloc
>>
>> I had a nice chat with Ralph this afternoon about this topic.
>>
>> He pointed out a few things to me:
>>
>> - I had forgotten (ahem) that carto has weights associated
>> with each of its edges (and that's kind of a defining
>> feature). hwloc, at present, does not. So perhaps hwloc
>> would not initially replace carto -- maybe in some future
>> hwloc version.
>>
>> - He also pointed out that not only paffinity, but also
>> sysinfo, could be replaced if hwloc comes in.
>>
>> He also made a good point that hwloc is only "sorta"
>> extensible right now -- meaning that, sure, you can add
>> support for new OS's and platforms, but not in as easy/clean
>> a way as we have in Open MPI. Specifically, adding new
>> support right now means editing much of the current hwloc
>> code: configure, adding #if's to the top-level tools and
>> library core, etc. It's not nearly as clean as just adding a
>> new plugin that is totally independent of the rest of the
>> code base. He thought it would be [greatly] beneficial if
>> hwloc uses the same plugin system as Open MPI before bringing
>> it in. Indeed, Open MPI may wish to extend hwloc in ways
>> that the main hwloc project is not interested in extending
>> (e.g., supporting some of Cisco's custom hardware). Fair point.
>>
>> Additionally, the topic of plugins came up within the context
>> of heterogeneity: have code to get the topology of the
>> machine (RAM + processors), but have separate code to mix in
>> accelerators/co-processors and other entities in the box.
>> One could easily imagine plugins for each different type of
>> entity that you would want to detect within a server.
>>
>> To some extent, the hwloc crew has already been discussing
>> these issues -- we can probably work elements of much of it
>> into what we're doing. For example, Brice and Samuel are
>> working on adding PCI device support to hwloc (although I
>> haven't been following the details of what they're doing).
>> We've also talked about adding hwloc functions for editing
>> the map that comes back. For example, hwloc could be used as
>> the cornerstone for a new OPAL framework base, and new
>> plugins in this base can use functions to add more
>> information to the initial map that is reported back by the
>> hwloc core. [shrug] Need to think about that more.
>>
>> This is all excellent feedback (I need to take it back to the
>> hwloc crew); please let me know what else you think about
>> these ideas tomorrow on the call.
>>
>>
>>
>> On Dec 14, 2009, at 4:13 PM, Jeff Squyres wrote:
>>
>>
>>> Question for everyone (possibly a topic for tomorrow's call...):
>>>
>>> hwloc is evolving into a fairly nice package. It's not ready for
>>> inclusion into Open MPI yet, but it's getting there. I predict it
>>> will come in somewhere early in the 1.5 series (potentially not
>>> 1.5.0, though). hwloc will provide two things:
>>>
>>> 1. A listing of all processors and memory, to include caches (and
>>> cache sizes!) laid out in a map, so you can see what processors
>>> share what memory (e.g., caches). Open MPI currently does not have
>>> this capability. Additionally, hwloc is currently growing support
>>> to include PCI devices in the map; that may make it into hwloc
>>> v1.0 or not.
>>>
>>> 2. Cross-platform / OS support. hwloc currently supports a nice
>>> variety of OSs and hardware platforms.
>>>
>>> Given that hwloc is already cross-platform, do we really need the
>>> carto framework? I.e., do we really need multiple carto plugins?
>>> More specifically: should we just use hwloc directly -- with no
>>> framework?
>>>
>>> Random points:
>>>
>>> - I'm about halfway finished with "embedding" code for hwloc like
>>> PLPA has, so, for example, all of hwloc's symbols can be prepended
>>> with opal_ or orte_ or whatever. Hence, embedding hwloc in OMPI
>>> would be "safe".
>>>
>>> - If we keep the carto framework, then we'll have to translate
>>> from hwloc's map to carto's map; there may be subtleties involved
>>> in the translation.
>>>
>>> - I guarantee that [much] more thought has been put into the hwloc
>>> map data structure design than carto's. :-) Indeed, to make all of
>>> hwloc's data available to OMPI, carto's map data structures may
>>> end up evolving to look pretty much exactly like hwloc's. In which
>>> case -- what's the point of carto?
>>>
>>> Thoughts?
>>>
>>> hwloc also provides processor binding functions, so it might also
>>> make the paffinity framework moot...
>>>
>>> --
>>> Jeff Squyres
>>> jsquyres_at_[hidden]
>>>
>>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>