Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] RFC: Move of ompi_bitmap_t
From: Broto, Laurent G (broto_at_[hidden])
Date: 2009-02-03 15:50:23


George,

May I miss something, but the division is a division by 8 or 16 or so
one... So it's not really costly since it's just a right bit shifft.
Same thing for the modulo (with a sub operation which not costly too):

#define SIZE_OF_CHAR ((int) (sizeof(char) * 8))
index = bit / SIZE_OF_CHAR;
offset = bit % SIZE_OF_CHAR;

Please correct me if I'm wrong.

Laurent

George Bosilca wrote:
> In the current bitmap implementation every time we set or check a bit
> we have to compute the index of the char where this bit is set and the
> relative position from the beginning of char. This requires two _VERY_
> expensive operations: a division and a modulo. Compared with the cost
> of these two operation a quick test for a max bit is irrelevant.
>
> In fact I think the safety limit if good for most cases. How about
> having the max bit to the limit used to initialize the bitmap? We can
> add a call to extend the bitmap in case some layer really need to
> extend it, but restrict all others layers to the number of bits
> requested when the bitmap is initialized.
>
> george.
>
>
> On Feb 1, 2009, at 20:32 , Jeff Squyres wrote:
>
>> Sounds fine; we do create Fortran handles in some
>> performance-critical sections of code (e.g., MPI_ISEND), so
>> eliminating an extra test is not a bad thing to do.
>>
>>
>> On Feb 1, 2009, at 11:40 AM, Broto, Laurent G. wrote:
>>
>>>
>>> Hi folks,
>>>
>>> I am Laurent Broto, a Rich Graham postdoc. I'm currently working on
>>> the BTL extraction with Greg Koenig and Rainer Keller.
>>>
>>> At this time, I want to group all the *_bitmap function in only one
>>> layer.
>>>
>>> Now, you know who I am :)
>>>
>>> So, just one question. I had in my mind:
>>> - adding a max_size in the opal_bitmap_t structure,
>>> - at the init time, set this field with INT_MAX or whatever the type
>>> is _MAX,
>>> - add a set_max_size functions to change the max_size,
>>> - for each function needs this test, just do if( new_size <
>>> param->max_size) ...
>>>
>>> I guess it is more efficient than the Jeff approach who is supposed to
>>> - first test if the max size has been set,
>>> - then ensure that the bitmap never grows beyond that size.
>>>
>>> In the first approach we only do one test, in the second one, always
>>> one and sometimes two.
>>>
>>> But may I miss something...
>>>
>>> What do you think about this ?
>>>
>>> --
>>> Laurent
>>>
>>> -----Original Message-----
>>> From: devel-bounces_at_[hidden] on behalf of Jeff Squyres
>>> Sent: Sun 2/1/2009 7:37 AM
>>> To: Open MPI Developers
>>> Subject: Re: [OMPI devel] RFC: Move of ompi_bitmap_t
>>>
>>> I just looked through both opal_bitmap_t and ompi_bitmap_t and I think
>>> that the only real difference is that in the ompi version, we check
>>> (in various places) that the size of the bitmap never grows beyond
>>> OMPI_FORTRAN_HANDLE_MAX; the opal version doesn't do these kind of
>>> size checks.
>>>
>>> I think it would be fairly straightforward to:
>>>
>>> - add generic checks into the opal version, perhaps by adding a new
>>> API call (opal_bitmap_set_max_size())
>>> - if the max size has been set, then ensure that the bitmap never
>>> grows beyond that size, otherwise let it have the same behavior as
>>> today (grow without bound -- assumedly until malloc() fails)
>>>
>>> It'll take a little care to ensure to merge the functionality
>>> correctly, but it is possible. Once that is done, you can:
>>>
>>> - remove the ompi_bitmap_t class
>>> - s/ompi_bitmap/opal_bitmap/g in the OMPI layer
>>> - add new calls to opal_bitmap_set_max_size(&bitmap,
>>> OMPI_FORTRAN_HANDLE_MAX) in the OMPI layer (should only be in a few
>>> places -- probably one for each MPI handle type...? It's been so long
>>> since I've looked at that code that I don't remember offhand)
>>>
>>> I'd generally be in favor of this because, although this is not a lot
>>> of repeated code, it *is* repeated code -- so cleaning it up and
>>> consolidating the non-Fortran stuff down in opal is not a Bad Thing.
>>>
>>>
>>> On Jan 30, 2009, at 4:59 PM, Ralph Castain wrote:
>>>
>>>> The history is simple. Originally, there was one bitmap_t in orte
>>>> that was also used in ompi. Then the folks working on Fortran found
>>>> that they had to put a limit in the bitmap code to avoid getting
>>>> values outside of Fortran's range. However, this introduced a
>>>> problem - if we had the limit in the orte version, then we limited
>>>> ourselves unnecessarily, and introduced some abstraction questions
>>>> since orte knows nothing about Fortran.
>>>>
>>>> So two were created. Then the orte_bitmap_t was blown away at a
>>>> later time when we removed the GPR as George felt it wasn't
>>>> necessary (which was true). It was later reborn when we needed it in
>>>> the routed system, but this time it was done in opal as others
>>>> indicated a potential more general use for that capability.
>>>>
>>>> The problem with uniting the two is that you either have to
>>>> introduce Fortran-based limits into opal (which messes up the non-
>>>> ompi uses), or deal with the Fortran limits in some other fashion.
>>>> Neither is particularly pleasant, though it could be done.
>>>>
>>>> I think it primarily is a question for the Fortran folks to address
>>>> - can they deal with Fortran limits in some other manner without
>>>> making the code unmanageable and/or taking a performance hit?
>>>>
>>>> Ralph
>>>>
>>>>
>>>> On Jan 30, 2009, at 2:40 PM, Richard Graham wrote:
>>>>
>>>>> This should really be viewed as a code maintenance RFC. The reason
>>>>> this
>>>>> came up in the first place is because we are investigating the btl
>>>>> move, but
>>>>> these are really two very distinct issues. There are two bits of
>>>>> code that
>>>>> have virtually the same functionality - they do have the same
>>>>> interface I am
>>>>> told. The question is, is there a good reason to keep two different
>>>>> versions in the repository ? Not knowing the history of why a second
>>>>> version was created this is an inquiry. Is there some performance
>>>>> advantage, or some other advantage to having these two versions ?
>>>>>
>>>>> Rich
>>>>>
>>>>>
>>>>> On 1/30/09 3:23 PM, "Terry D. Dontje" <Terry.Dontje_at_[hidden]> wrote:
>>>>>
>>>>>> I second Brian's concern. So unless this is just an announcement
>>>>>> that
>>>>>> this is being done on a tmp branch only until everything is in
>>>>>> order I
>>>>>> think we need further discussions.
>>>>>>
>>>>>> --td
>>>>>>
>>>>>> Brian Barrett wrote:
>>>>>>> So once again, I bring up my objection of this entire line of
>>>>>>> moving
>>>>>>> until such time as the entire process is properly mapped out. I
>>>>>>> believe it's premature to being moving around code in preparation
>>>>>>> for
>>>>>>> a move that hasn't been proven viable yet. Until there is concrete
>>>>>>> evidence that such a move is possible, won't degrade application
>>>>>>> performance, and does not make the code totally unmaintainable, I
>>>>>>> believe that any related code changes should not be brought into
>>>>>>> the
>>>>>>> trunk.
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>>
>>>>>>> On Jan 30, 2009, at 12:30 PM, Rainer Keller wrote:
>>>>>>>
>>>>>>>> On behalf of Laurent Broto
>>>>>>>>
>>>>>>>> RFC: Move of ompi_bitmap_t
>>>>>>>>
>>>>>>>> WHAT: Move ompi_bitmap_t into opal or onet-layer
>>>>>>>>
>>>>>>>> WHY: Remove dependency on ompi-layer.
>>>>>>>>
>>>>>>>> WHERE: ompi/class
>>>>>>>>
>>>>>>>> WHEN: Open MPI-1.4
>>>>>>>>
>>>>>>>> TIMEOUT: February 3, 2009.
>>>>>>>>
>>>>>>>> -------------------------------------
>>>>>>>> Details:
>>>>>>>> WHY:
>>>>>>>> The ompi_bitmap_t is being used in various places within
>>>>>>>> opal/orte/ompi. With
>>>>>>>> the proposed splitting of BTLs into a separate library, we are
>>>>>>>> currently
>>>>>>>> investigating several of the differences between ompi/class/* and
>>>>>>>> opal/class/*
>>>>>>>>
>>>>>>>> One of the items is the ompi_bitmap_t which is quite similar to
>>>>>>>> the
>>>>>>>> opal_bitmap_t.
>>>>>>>> The question is, whether we can remove favoring a solution just
>>>>>>>> in opal.
>>>>>>>>
>>>>>>>> WHAT:
>>>>>>>> The data structures in the opal-version are the same,
>>>>>>>> so is the interface,
>>>>>>>> the implementation is *almost* the same....
>>>>>>>>
>>>>>>>> The difference is the Fortran handles ;-]!
>>>>>>>>
>>>>>>>> Maybe we're missing something but could we have a discussion, on
>>>>>>>> why
>>>>>>>> Fortran
>>>>>>>> sizes are playing a role here, and if this is a hard
>>>>>>>> requirement, how
>>>>>>>> we could
>>>>>>>> settle that into that current interface (possibly without a
>>>>>>>> notion of
>>>>>>>> Fortran,
>>>>>>>> but rather, set some upper limit that the bitmap may grow to?)
>>>>>>>>
>>>>>>>> With best regards,
>>>>>>>> Laurent and Rainer
>>>>>>>> --
>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> Rainer Keller, PhD Tel: (865) 241-6293
>>>>>>>> Oak Ridge National Lab Fax: (865) 241-4811
>>>>>>>> PO Box 2008 MS 6164 Email: keller_at_[hidden]
>>>>>>>> Oak Ridge, TN 37831-2008 AIM/Skype: rusraink
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> devel mailing list
>>>>>>>> devel_at_[hidden]
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> devel_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> devel_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel