Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25445
From: George Bosilca (bosilca_at_[hidden])
Date: 2011-11-07 22:58:07


There is one major reason, if you look at them as a whole they look half-baked. And we are not supposed to accept such commits in the trunk. Development branches are the way to go, propose coherent and complete patches. Bringing disconnected pieces, and claiming that magically they will compose a whole later on, is not an acceptable approach.

I'll take as an example the atomic operations. They exist only on amd64, and only if the compiler supports gcc-like assembly. However, the atomic operation is defined in a global header with a very exciting name atomic.h. If anybody else start using these atomics (and as they appear in a header file I don't see why not), there is a very high probability of weird build or execution errors. [foot-note: Amazingly, all the productive-quality lovers in this community vanished lately.] I let you imagine what would have been the answer from the community if you had proposed such an RFC before doing the actual commit.

I'm looking forward to the cleaning part.

  george.

PS: Regarding the hand-copy instead of the memcpy, we tried to avoid using memcpy in performance critical codes, especially when we know the size of the data and the alignment. This relieves the compiler of adding ugly intrinsics, allowing it to nicely pipeline to load/stores. Anyway, with both approaches you will copy more data than needed for all BTLs except uGNI.

On Nov 7, 2011, at 21:48 , Nathan T. Hjelm wrote:

>
>
> On Mon, 7 Nov 2011 17:18:42 -0500, George Bosilca <bosilca_at_[hidden]>
> wrote:
>> A little bit of history:
>>
>> 1. r25305: added 2 atomic operations to OPAL. However, they only exists
> on
>> amd64 and are only used in the vader BTL, which I assume only supports
>> amd64.
>
> Two things:
> - The atomic is a new feature that has no impact on existing code. It can
> also be implemented on Intel but we have not tested it (yet).
> - The atomic was pushed to support lock-free queues in the Vader BTL.
> Vader does not need the atomics and can use an atomic lock lock but I see
> higher latencies when using locks.
>
> Why would this change (that has no impact on any other code) need an RFC?
>
>> 2. r25334: The seg_key union got a new member ptr. This member is solely
>> used in the vader BTL, as all other BTL use a compiler trick to convert a
>> pointer to a 64 bits.
>
> I am actually going to remove that member. I prefer the use of uintptr_t
> over casting to a uint64_t but it has no real benefit and possibly a
> pitfall due to its platform dependent size.
>
> But the member has, like the atomic, no impact on any exiting code. It does
> not change the size of the seg_key and was only used by Vader. Why would
> this change have required an RFC?
>
>> 3. r25445: All members of the seg_key union got friends, because Cray
> dare
>> to set their keys at 128 bits long. However a quick
>> find . -name "*.[ch]" -exec grep -Hn seg_key {} \; | grep "\[1\]"
>> indicates that no BTL is using 128 bits keys. Code has been added to all
>> PMLs, but I guess they just copy empty data.
>
> For now they copy empty data but in the near future (as I have said) we
> will need to bits for the ugni btl (Cray XE Gemini). I pushed this code to
> prepare for pushing ugni.
>
> Also, you might be a good person to ask: Why do we copy each member of a
> segment individually in the PMLs? Wouldn't it be faster to do a memcpy? If
> we were using a memcpy I would not have had to make any change to the pmls.
>
>> What I see is a pattern of commits that can have been dealt with
>> differently. None had an RFC, and most of them are not even used.
>
> I think you are reaching a little here. I pushed several changes over a
> period of a month. The first two are not related to the third which is the
> only one that could have any impact to existing code and might require an
> RFC.
>
> In retrospect I should have done a RFC for the 3rd change with a short
> timeout. At the time (operating on little sleep) it seemed like the commits
> would have minimal impact. Please let me know if the commits have any
> negative impact.
>
> -Nathan
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel