Ah; I thought this data structure was used for the Fortran MPI object
handles as well, but looking at the code now, I see that
opal_pointer_array's are used for that. The only Fortran place it is
used is for attributes.

I see ompi_bitmap_t used for attributes, BTL reachability
computations, and the crcp. None of these strike me as performance
sensitive at all. I see it used in the tuned collectives but I agree
that one additional compare probably doesn't matter (sure, if we can
remove it, that's great -- but in the larger scheme of things, we're
going to pay much more in latency for collectives than one comparison).

I see Ralph's argument about the max Fortran value being defined in
opal_config.h, but remember that that's only a side-effect of how
Autoconf works. If Autoconf had allowed us, we would have had 3 truly
different files seeded with different #defines from configure:
opal_config.h, orte_config.h, and ompi_config.h (right now we dump
everything in opal_config.h; the others essentially wholly include
opal_config.h and add a few more values/defines -- we don't have a
clear separation of each layer's results from configure).

I think setting some reasonable max size for ompi_bitmap_t, with a
new API call to re-define it if desired, would be fine (e.g.,
the MPI layer can call it with the max fortran value to ensure that it
has the size that it needs for attributes). But if someone wants to
re-code the whole thing to have a definite max size (i.e., not re-size
if you set a bit that doesn't yet exist), go to it. I don't really
care. It strikes me that there's more important stuff to do in our
code base than to optimize our glue bitmap code, though. :-)
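A minimal sketch of the "max size plus a call to re-define it" idea, for illustration only. Every name here (sketch_bitmap_t, sketch_bitmap_set_max_size, and so on) is hypothetical and is not the ompi_bitmap_t API; the doubling growth policy and the -1 error return are assumptions as well.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ompi_bitmap_t; not the real struct. */
typedef struct {
    unsigned char *bits;  /* backing storage */
    size_t nbits;         /* bits currently allocated */
    size_t max_bits;      /* ceiling that set_bit will never grow past */
} sketch_bitmap_t;

static size_t bytes_for(size_t nbits)
{
    return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}

int sketch_bitmap_init(sketch_bitmap_t *bm, size_t nbits, size_t max_bits)
{
    if (nbits == 0 || nbits > max_bits) return -1;
    bm->bits = calloc(bytes_for(nbits), 1);
    if (bm->bits == NULL) return -1;
    bm->nbits = nbits;
    bm->max_bits = max_bits;
    return 0;
}

/* The "new API call": a layer that needs more room (e.g., the MPI layer
 * with the max Fortran value) re-defines the ceiling explicitly. */
int sketch_bitmap_set_max_size(sketch_bitmap_t *bm, size_t max_bits)
{
    if (max_bits < bm->nbits) return -1;  /* cannot go below what is allocated */
    bm->max_bits = max_bits;
    return 0;
}

int sketch_bitmap_set_bit(sketch_bitmap_t *bm, size_t bit)
{
    if (bit >= bm->max_bits) return -1;   /* past the ceiling: refuse */
    if (bit >= bm->nbits) {
        /* Grow to cover the bit: at least double, capped at max_bits. */
        size_t new_nbits = bm->nbits * 2;
        if (new_nbits <= bit) new_nbits = bit + 1;
        if (new_nbits > bm->max_bits) new_nbits = bm->max_bits;
        size_t old_bytes = bytes_for(bm->nbits);
        size_t new_bytes = bytes_for(new_nbits);
        unsigned char *p = realloc(bm->bits, new_bytes);
        if (p == NULL) return -1;
        memset(p + old_bytes, 0, new_bytes - old_bytes);  /* zero the new tail */
        bm->bits = p;
        bm->nbits = new_nbits;
    }
    bm->bits[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
    return 0;
}

bool sketch_bitmap_is_set(const sketch_bitmap_t *bm, size_t bit)
{
    if (bit >= bm->nbits) return false;
    return (bm->bits[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1u;
}

void sketch_bitmap_free(sketch_bitmap_t *bm)
{
    free(bm->bits);
    bm->bits = NULL;
    bm->nbits = 0;
    bm->max_bits = 0;
}
```

In this sketch, callers that never exceed their initial size only pay for the one compare against max_bits; only a caller that raises the ceiling can trigger a resize.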
Ok, I'm now done talking about this. :-)

On Feb 3, 2009, at 4:45 PM, George Bosilca wrote:
> These places are easy to find and track. I did it on the ORTE layer,
> and in this context the bitmap is _NOT_ required to grow as all
> bitmaps are initialized with the number of processes in the jobid. In
> the OMPI layer there are a few places using the bitmap:
> - the hierarch collective. There the bitmap is initialized with the
> size of the communicator, so it will _NEVER_ get expanded.
> - in the PML (DR and OB1). Again the bitmap is initialized using the
> number of processes, so it will _NEVER_ get expanded.
> - in the attributes. This is the only place where the bitmap might
> expand. However, as the current implementation is not thread safe
> and as this call is outside the critical path, we can modify it in
> order to expand the bitmap manually.
> So, it appears that we don't really take advantage of the original
> design for the bitmap. It might be time to revise it ...
> On Feb 3, 2009, at 15:30 , Jeff Squyres wrote:
>> On Feb 3, 2009, at 3:24 PM, George Bosilca wrote:
>>> In the current bitmap implementation every time we set or check a
>>> bit we have to compute the index of the char where this bit is set
>>> and the relative position from the beginning of char. This
>>> requires two _VERY_ expensive operations: a division and a modulo.
>>> Compared with the cost of these two operations, a quick test for a
>>> max bit is irrelevant.
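To make the index computation being discussed concrete, here is a small sketch. The helper names are hypothetical and this is not the ompi_bitmap_t source; it only assumes a bitmap stored as an array of unsigned char. The shift/mask variants are included because, for an unsigned index and an 8-bit char, they compute the same values as the division and modulo.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#if CHAR_BIT != 8
#error "the shift/mask variants below assume 8-bit chars"
#endif

/* Hypothetical helpers: which char holds the bit, and where inside it. */
static inline size_t byte_index(size_t bit)
{
    return bit / CHAR_BIT;              /* the division */
}

static inline unsigned bit_offset(size_t bit)
{
    return (unsigned)(bit % CHAR_BIT);  /* the modulo */
}

/* Equivalent forms for an unsigned index when CHAR_BIT == 8. */
static inline size_t byte_index_shift(size_t bit)
{
    return bit >> 3;
}

static inline unsigned bit_offset_mask(size_t bit)
{
    return (unsigned)(bit & 7u);
}

/* Test a bit with the one extra compare against the bitmap's limit. */
bool checked_test_bit(const unsigned char *bits, size_t nbits, size_t bit)
{
    if (bit >= nbits) return false;     /* the "quick test for a max bit" */
    return (bits[byte_index(bit)] >> bit_offset(bit)) & 1u;
}
```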
>>> In fact I think the safety limit is good for most cases. How about
>>> setting the max bit to the limit used to initialize the bitmap? We
>>> can add a call to extend the bitmap in case some layer really needs
>>> to extend it, but restrict all other layers to the number of bits
>>> requested when the bitmap is initialized.
>> The problem with this is that the original design expands the
>> bitmap whenever you try to set a bit that doesn't yet exist. So
>> you'd need to track down every place in the code that exercises
>> this assumption.
>> You could set a max size if you want to (e.g., assuming you'll
>> never have more than some_large_value of Fortran handles [probably
>> considerably less than the number of Fortran integers available],
>> or somesuch).
>> Jeff Squyres
>> Cisco Systems
>> devel mailing list