Open MPI Development Mailing List Archives

From: George Bosilca (bosilca_at_[hidden])
Date: 2007-05-29 14:40:59


Commit r14791 applies this patch to the trunk. Let me know if you
encounter any trouble.

   Thanks,
     george.

On May 29, 2007, at 2:28 PM, Ralph Castain wrote:

> After some work off-list with Tim, it appears that something has been
> broken again on the OMPI trunk with respect to comm_spawn. It was
> working two weeks ago, but...sigh.
>
> Anyway, it doesn't appear to have any bearing either way on George's
> patch(es), so whoever wants to commit them is welcome to do so.
>
> Thanks
> Ralph
>
>
> On 5/29/07 11:44 AM, "Ralph Castain" <rhc_at_[hidden]> wrote:
>
>>
>>
>>
>> On 5/29/07 11:02 AM, "Tim Prins" <tprins_at_[hidden]> wrote:
>>
>>> Well, after fixing many of the tests...
>>
>> Interesting - they worked fine for me. Perhaps a difference in
>> environment.
>>
>>> It passes all the tests except the spawn tests. However, the spawn
>>> tests are seriously broken without this patch as well, and the ibm
>>> mpi spawn tests seem to work fine.
>>
>> Then something is seriously wrong. The spawn tests were working as of
>> my last commit - that is a test I religiously run. If the spawn test
>> here doesn't work, then it is hard to understand how the mpi spawn can
>> work since the call is identical.
>>
>> Let me see what's wrong first...
>>
>>>
>>> As far as I'm concerned, this should assuage any fear of problems
>>> with these changes and they should now go in.
>>>
>>> Tim
>>>
>>> On May 29, 2007, at 11:34 AM, Ralph Castain wrote:
>>>
>>>> Well, I'll be the voice of caution again...
>>>>
>>>> Tim: did you run all of the orte tests in the orte/test/system
>>>> directory? If so, and they all run correctly, then I have no issue
>>>> with doing the commit. If not, then I would ask that we not do the
>>>> commit until that has been done.
>>>>
>>>> In running those tests, you need to run them on a multi-node system,
>>>> both using mpirun and as singletons (you'll have to look at the tests
>>>> to see which ones make sense in the latter case). This will ensure
>>>> that we have at least some degree of coverage.
>>>>
>>>> Thanks
>>>> Ralph
>>>>
>>>>
>>>>
>>>> On 5/29/07 9:23 AM, "George Bosilca" <bosilca_at_[hidden]> wrote:
>>>>
>>>>> I'd be happy to commit the patch into the trunk. But after what
>>>>> happened last time, I'm more than cautious. If the community thinks
>>>>> the patch is worth having, let me know and I'll push it into the
>>>>> trunk asap.
>>>>>
>>>>> Thanks,
>>>>> george.
>>>>>
>>>>> On May 29, 2007, at 10:56 AM, Tim Prins wrote:
>>>>>
>>>>>> I think both patches should be put in immediately. I have done some
>>>>>> simple testing, and with 128 nodes of odin, with 1024 processes
>>>>>> running mpi hello, these decrease our running time from about 14.2
>>>>>> seconds to 10.9 seconds. This is a significant decrease, and as the
>>>>>> scale increases there should be increasing benefit.
>>>>>>
>>>>>> I'd be happy to commit these changes if no one objects.
>>>>>>
>>>>>> Tim
>>>>>>
>>>>>> On May 24, 2007, at 8:39 AM, Ralph H Castain wrote:
>>>>>>
>>>>>>> Thanks - I'll take a look at this (and the prior ones!) in the next
>>>>>>> couple of weeks when time permits and get back to you.
>>>>>>>
>>>>>>> Ralph
>>>>>>>
>>>>>>>
>>>>>>> On 5/23/07 1:11 PM, "George Bosilca" <bosilca_at_[hidden]> wrote:
>>>>>>>
>>>>>>>> Attached is another patch to the ORTE layer, more specifically the
>>>>>>>> replica. The idea is to decrease the number of strcmp calls by using
>>>>>>>> a small hash function before doing the strcmp. The hash key for each
>>>>>>>> registry entry is computed when it is added to the registry. When
>>>>>>>> we're doing a query, instead of comparing the 2 strings we first
>>>>>>>> check if the hash keys match, and if they do match then we compare
>>>>>>>> the 2 strings in order to make sure we eliminate collisions from our
>>>>>>>> answers.
>>>>>>>>
>>>>>>>> There is some benefit in terms of performance. It's hardly visible
>>>>>>>> for a few processes, but it starts showing up when the number of
>>>>>>>> processes increases. In fact, the number of strcmp calls in the trace
>>>>>>>> file drastically decreases. The main reason it works well is that
>>>>>>>> most of the keys start with basically the same chars (such as
>>>>>>>> orte-blahblah), which turns each strcmp into a loop over a few chars.
>>>>>>>>
>>>>>>>> Ralph, please consider it for inclusion in the ORTE layer.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> george.
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> devel mailing list
>>>>>>>> devel_at_[hidden]
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel



  • application/pkcs7-signature attachment: smime.p7s
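
The hash-then-strcmp check described in the earliest message of this
thread can be sketched roughly as follows. This is only an illustration,
not the code from the patch or from r14791: the entry_t layout, the
function names, and the djb2-style hash are all assumptions made for the
example, and the real ORTE registry structures are different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical registry entry: the hash is stored next to the key so it
 * is computed only once, when the entry is added. */
typedef struct {
    const char *key;
    uint32_t    hash;
} entry_t;

/* Small djb2-style string hash, chosen purely for illustration. */
static uint32_t key_hash(const char *s)
{
    uint32_t h = 5381u;
    assert(s != NULL);
    while (*s != '\0') {
        h = h * 33u + (unsigned char)*s++;
    }
    return h;
}

/* Fill in an entry, caching the hash of its key. */
static void entry_init(entry_t *e, const char *key)
{
    e->key  = key;
    e->hash = key_hash(key);
}

/* Return the index of the entry whose key equals `key`, or -1 if there
 * is none. *n_strcmp reports how many full string comparisons ran. */
static int registry_find(const entry_t *tab, size_t n,
                         const char *key, size_t *n_strcmp)
{
    uint32_t h = key_hash(key);
    *n_strcmp = 0;
    for (size_t i = 0; i < n; i++) {
        if (tab[i].hash != h) {
            continue;               /* cheap integer reject, no strcmp */
        }
        (*n_strcmp)++;
        if (strcmp(tab[i].key, key) == 0) {
            return (int)i;          /* hash and string both match */
        }
        /* hashes matched but strings differ: a collision, keep looking */
    }
    return -1;
}
```

With keys that share a long common prefix (made-up names such as
"orte-node-name" and "orte-proc-pid"), a plain strcmp has to walk past
the prefix before it can report a mismatch, while the integer test
rejects most non-matching entries in a single comparison. The final
strcmp is still needed because two different keys can hash to the same
value.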