Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] neighborhood collectives issues
From: George Bosilca (bosilca_at_[hidden])
Date: 2013-09-27 11:44:31


Patch looks good. Thanks for the fix.

  George.

On Sep 27, 2013, at 17:32 , Nathan Hjelm <hjelmn_at_[hidden]> wrote:

> Would help to attach the workaround. Attached.
>
> -Nathan
>
> On Fri, Sep 27, 2013 at 09:31:08AM -0600, Nathan Hjelm wrote:
>> On Fri, Sep 27, 2013 at 01:01:01PM +0000, Jeff Squyres (jsquyres) wrote:
>>> On Sep 27, 2013, at 3:27 AM, George Bosilca <bosilca_at_[hidden]> wrote:
>>>
>>>> The addition of the neighborhood collectives to the mca_coll_base_comm_coll_t structure increased the size of the ompi_communicator_t structure beyond the limit of the predefined padding (PREDEFINED_COMMUNICATOR_PAD). This is not a small change: it will break the ABI with all past versions of Open MPI.
>>>
>>> This is going to be problematic for putting this in 1.7.4.
>>>
>>> Nathan: is there another way? Perhaps even just a stopgap way for the 1.7/1.8 series, and we can keep the "real" way for 1.9+? I.e., perhaps:
>>>
>>> 1. keep PREDEFINED_COMMUNICATOR_PAD at current value for v1.7.x/1.8, but use a secondary pointer system (which won't be *too* painful; the algorithms are all simple/not optimized, anyway)
>>>
>>> 2. increase PREDEFINED_COMMUNICATOR_PAD on the trunk for v1.9+ (we might want to increase it more than it is already increased, so that we actually have some breathing room for 1.9+)
>>>
>>>> I pushed a temporary commit to allow the trunk to be built, but we might want a better solution.
>>
>> Ok, it looks like the structure was exactly 128 * sizeof (void *) without peruse, so enabling peruse
>> would make it go over the max. Attached is a workaround so we don't have to increase the size of
>> the communicator for 1.7.x. George, let me know if you think this solution is acceptable.
>>
>>> Thanks George.
>>>
>>>> There are new warnings:
>>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c: In function 'libnbc_comm_query':
>>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:196:48: warning: assignment from incompatible pointer type [enabled by default]
>>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:197:49: warning: assignment from incompatible pointer type [enabled by default]
>>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:198:47: warning: assignment from incompatible pointer type [enabled by default]
>>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:199:48: warning: assignment from incompatible pointer type [enabled by default]
>>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:200:48: warning: assignment from incompatible pointer type [enabled by default]
>>>
>>>
>>> Nathan: please fix.
>>
>> Ok. Will commit a fix and add a comment to coll.h that increasing the size of mca_coll_base_comm_coll_t might
>> require PREDEFINED_COMMUNICATOR_PAD to be increased. I didn't see an issue with the communicator size because
>> I never modified the communicator directly.
>>
>> -Nathan
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> <0001-Prepare-the-neighborhood-collectives-for-1.7.x.patch.gz>
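
For readers following the thread without the attachment: the padding constraint and the "secondary pointer" stopgap discussed above can be illustrated with a minimal C sketch. It is not the attached patch and not Open MPI's actual code. Every name in it (toy_comm_t, toy_neigh_table_t, TOY_COMM_PAD, and the helper functions) is hypothetical, and the 128 * sizeof(void *) budget is borrowed from the figure quoted in the thread purely for illustration. The idea shown is that a predefined object is padded out to a fixed, ABI-visible size, that a compile-time check catches the structure outgrowing that budget, and that function pointers added later can sit behind a single pointer so the parent structure grows by only one slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed size reserved for a predefined object. Applications
 * compiled against the library bake this size in, so it is part of the ABI. */
#define TOY_COMM_PAD (128 * sizeof(void *))

typedef int (*toy_coll_fn_t)(void *buf, int count);

/* Function pointers added after the ABI was frozen live in their own table. */
typedef struct toy_neigh_table {
    toy_coll_fn_t neighbor_allgather;
    toy_coll_fn_t neighbor_alltoall;
} toy_neigh_table_t;

typedef struct toy_comm {
    int rank;
    int size;
    toy_coll_fn_t barrier;
    toy_coll_fn_t bcast;
    /* Secondary pointer: the parent grows by one pointer, no matter how
     * many entries the table behind it gains later. */
    toy_neigh_table_t *neigh;
} toy_comm_t;

/* Fail the build, rather than silently break the ABI, if the structure
 * no longer fits inside its reserved space. */
_Static_assert(sizeof(toy_comm_t) < TOY_COMM_PAD,
               "toy_comm_t outgrew TOY_COMM_PAD");

/* The predefined object is always exactly TOY_COMM_PAD bytes. */
typedef struct toy_predefined_comm {
    toy_comm_t comm;
    char padding[TOY_COMM_PAD - sizeof(toy_comm_t)];
} toy_predefined_comm_t;

/* Placeholder collective: ignores the buffer and returns the count. */
static int toy_noop(void *buf, int count)
{
    (void) buf;
    return count;
}

/* Allocate and fill the secondary table. Returns 0 on success, -1 if the
 * allocation fails. */
static int toy_comm_enable_neigh(toy_comm_t *comm)
{
    toy_neigh_table_t *table = calloc(1, sizeof(*table));
    if (NULL == table) {
        return -1;
    }
    table->neighbor_allgather = toy_noop;
    table->neighbor_alltoall = toy_noop;
    comm->neigh = table;
    return 0;
}

/* Free the secondary table and clear the pointer. */
static void toy_comm_release_neigh(toy_comm_t *comm)
{
    free(comm->neigh);
    comm->neigh = NULL;
}
```

The cost of the indirection is one extra pointer dereference per call, which matches the remark in the thread that this would not be too painful for simple, unoptimized algorithms. The compile-time check is the kind of guard that would have flagged the size problem at build time instead of at run time.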