Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Move the Open MPI communication infrastructure in OPAL
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2014-07-23 22:20:18


Sweet; I'll have a look at all of that -- thanks.

On Jul 23, 2014, at 10:15 PM, George Bosilca <bosilca_at_[hidden]> wrote:

> I was struggling with a similar issue while trying to fix the OpenIB compilation, and I chose to implement a different approach, which does not require knowledge of what's inside opal_process_name_t.
>
> Look in opal/util/proc.h. You should be able to use: opal_process_name_vpid and opal_process_name_jobid. They will remain there until we figure out a nice way to get rid of them completely.
>
> HINT: I personally prefer to get rid of vpid and jobid completely. As long as we need the info only for a visual clue, the output of OPAL_NAME_PRINT might be enough.
>
> George.
>
> On Jul 23, 2014, at 22:11 , Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>
>> Ralph and I chatted in IM.
>>
>> For the moment, I'm masking off the lower 32 bits to get the VPID, the uppermost 16 as the job family, and the next 16 as the sub-family.
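
The masking described above can be sketched as follows. This is only an illustration under stated assumptions: it treats the process name as a plain 64-bit integer laid out exactly as the message describes, and the type and function names (name_fields_t, split_name) are hypothetical, not part of Open MPI. The real opal_process_name_t may be laid out differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, taken from the description in the message:
 *   bits 63..48  job family
 *   bits 47..32  sub-family
 *   bits 31..0   VPID
 * This is an assumption for illustration, not the Open MPI definition. */
typedef struct {
    uint16_t job_family;
    uint16_t sub_family;
    uint32_t vpid;
} name_fields_t;

/* Split a 64-bit name into its three fields by shifting and masking. */
static name_fields_t split_name(uint64_t name)
{
    name_fields_t f;
    f.vpid       = (uint32_t)(name & 0xFFFFFFFFu);
    f.sub_family = (uint16_t)((name >> 32) & 0xFFFFu);
    f.job_family = (uint16_t)((name >> 48) & 0xFFFFu);
    return f;
}

/* Small self-check showing the expected field values for one input. */
static void split_name_example(void)
{
    name_fields_t f = split_name(UINT64_C(0x00120034000000AB));
    assert(f.job_family == 0x0012);
    assert(f.sub_family == 0x0034);
    assert(f.vpid == 0xAB);
}
```

Shifting before masking keeps every constant small and makes the field boundaries easy to read off the code.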
>>
>> If George makes the name be a handle with accessors to get the parts, we can switch to using that.
>>
>>
>>
>> On Jul 23, 2014, at 9:57 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>>> You should be able to memcpy it to an ompi_process_name_t and then extract it as usual
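
A minimal sketch of the memcpy approach, assuming the opaque name and the structured name have the same size. The types below (example_opaque_name_t, example_name_t) and the helper names are hypothetical stand-ins rather than the real Open MPI definitions. Which field ends up in which half of the 64-bit value depends on the host byte order, so the sketch only relies on a round trip.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; not the real Open MPI definitions. */
typedef uint64_t example_opaque_name_t;

typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} example_name_t;

/* Copy the opaque bytes into the structured form.
 * memcpy avoids the aliasing problems of a pointer cast. */
static example_name_t unpack_name(example_opaque_name_t opaque)
{
    example_name_t n;
    assert(sizeof n == sizeof opaque);  /* only valid for equal sizes */
    memcpy(&n, &opaque, sizeof n);
    return n;
}

/* Inverse operation: copy the structured form into the opaque value. */
static example_opaque_name_t pack_name(example_name_t n)
{
    example_opaque_name_t opaque;
    assert(sizeof n == sizeof opaque);
    memcpy(&opaque, &n, sizeof opaque);
    return opaque;
}
```

Packing a name and unpacking it again returns the original jobid and vpid on any byte order, which is the property the sketch depends on.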
>>>
>>>
>>> On Jul 23, 2014, at 6:51 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>>>
>>>> George --
>>>>
>>>> Is there a way to get the MPI_COMM_WORLD rank of an opal_process_name_t?
>>>>
>>>> I am currently outputting some information about peer processes in the usnic BTL to include the peer's VPID, which is the MCW rank. I'll be sad if that goes away...
>>>>
>>>>
>>>> On Jul 15, 2014, at 2:06 AM, George Bosilca <bosilca_at_[hidden]> wrote:
>>>>
>>>>> Ralph,
>>>>>
>>>>> There are two reasons that prevent me from pushing this RFC forward.
>>>>>
>>>>> 1. Minor: The code has some minor issues related to the last set of BTL/PML changes, and I haven't found the time to fix them.
>>>>>
>>>>> 2. Major: Not all BTLs have been updated and validated. What we need at this point from their respective developers is a little help with the validation process. We need to validate that the new code works as expected and passes all tests.
>>>>>
>>>>> The move will be ready to go as soon as all BTL developers raise the green flag. I got it from Jeff (but the last USNIC commit broke something), and myself. In other words, TCP, self, SM and USNIC are good to go. For the others, as I haven't heard back from their developers/maintainers, I assume they are not yet ready. Here I am referring to OpenIB, Portals4, Scif, smcuda, ugni, usnic and vader.
>>>>>
>>>>> George.
>>>>>
>>>>> PS: As a reminder the code is available at https://bitbucket.org/bosilca/ompi-btl
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 11, 2014 at 3:17 PM, Pritchard, Howard P <howardp_at_[hidden]> wrote:
>>>>> Hi Folks,
>>>>>
>>>>> No work is planned for the uGNI BTL at this time either.
>>>>>
>>>>> Howard
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Jeff Squyres (jsquyres)
>>>>> Sent: Thursday, July 10, 2014 5:04 PM
>>>>> To: Open MPI Developers List
>>>>> Subject: Re: [OMPI devel] RFC: Move the Open MPI communication infrastructure in OPAL
>>>>>
>>>>> FWIW: I can't speak for other BTL maintainers, but I'm out of the office for the next week, and the usnic BTL will be standing still during that time. Once I return, I will be making additional changes in the usnic BTL (new features, updates, etc.).
>>>>>
>>>>> So if you have the cycles, doing it in the next week or so would be good because at least there will be no conflicts with usnic BTL concurrent development. :-)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Jul 10, 2014, at 2:56 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>>
>>>>>> George: any update on when this will happen?
>>>>>>
>>>>>>
>>>>>> On Jun 4, 2014, at 9:14 PM, George Bosilca <bosilca_at_[hidden]> wrote:
>>>>>>
>>>>>>> WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL
>>>>>>>
>>>>>>> WHY: All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have expressed interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose.
>>>>>>>
>>>>>>> UTK, with support from Sandia, developed a version of Open MPI where the entire communication infrastructure has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to complete this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.
>>>>>>>
>>>>>>> WHERE: bitbucket.org/bosilca/ompi-btl (updated today with respect to trunk r31952)
>>>>>>>
>>>>>>> TIMEOUT: After all the BTLs have been amended to match the new location and usage. We will discuss the last bits regarding this RFC at the Open MPI developers meeting in Chicago, June 24-26. The RFC will become final only after the meeting.
>>>>>>> _______________________________________________
>>>>>>> devel mailing list
>>>>>>> devel_at_[hidden]
>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>> Link to this post:
>>>>>>> http://www.open-mpi.org/community/lists/devel/2014/06/14974.php
>>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquyres_at_[hidden]
>>>>> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/