Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Quiet Time on Trunk - ORCA Integration
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2012-06-27 12:20:42


This morning Jeff, Brian, Ralph, and I discussed the issues that
emerged with the ORCA branch. We have a clear path forward. It might
take a little while (a few weeks) to finish the effort since I am going
to be transiting over the next month, but we will see once we get into
the code.

We will start working on the fixes immediately. There is a new
BitBucket branch for this effort that you can watch at:
  https://bitbucket.org/jjhursey/ompi-orca

(The branch will be filled in later today)

We'll post back when it is ready for further review.

-- Josh

On Tue, Jun 26, 2012 at 9:29 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
> ORCA was backed out of the trunk in r26676.
>
> Once we fix the linking issue, we will bring this back.
>
> Sorry for the noise folks. The trunk is open again.
>
> -- Josh
>
> On Tue, Jun 26, 2012 at 9:04 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>> So I'm spinning my wheels on this one. I am going to need someone with
>> more knowledge about linking to help (Jeff or Brian maybe?).
>>
>> I'll see what I can do to back this out of the trunk :(
>>
>> I wish this issue had come up earlier - I just don't ever build
>> on my Mac, so I never saw it.
>>
>> -- Josh
>>
>> On Tue, Jun 26, 2012 at 8:25 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>> So I can confirm that it is not linking properly on the Mac. It -is-
>>> running correctly on Linux (which is where I have been testing).
>>>
>>> From what I can tell this is a linking issue specific to the Mac. I'm
>>> digging into it a bit at the moment.
>>>
>>> An interesting way to tell is to use ompi_info as a canary. If you
>>> compile with (what is the default):
>>>  ../../../ompi/libmpi.la -L/Users/jju/work/open-mpi/ompi-trunk/orte/ -lopen-rte
>>> it will display components from OMPI, ORCA, OPAL.
>>> If you change that to:
>>>  ../../../ompi/libmpi.la  ../../../orte/libopen-rte.la
>>> you get the same thing, but if you change it to:
>>>  ./../../ompi/libmpi.la  ../../../orte/libopen-rte.la ./../../ompi/libmpi.la
>>> then it only displays the ORTE, OPAL components.
>>>
>>> So I am thinking that this is an issue with using two .la files in a
>>> single link step - something that is not showing up on Linux.
>>>
>>> Any pointers on what might be going on here would be appreciated as I
>>> dig further.
>>>
>>> -- Josh
>>>
>>>
>>> On Tue, Jun 26, 2012 at 7:40 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>>> That is odd. I did not see that when testing on Linux. I'll take a look.
>>>>
>>>> -- josh
>>>>
>>>> On Tue, Jun 26, 2012 at 7:37 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>> FWIW: it built fine on my Mac, but doesn't run. Just hangs when attempting to execute any MPI application. Will execute non-MPI apps though.
>>>>>
>>>>> Looking deeper, it looks like the app is stuck waiting to receive the sync registration "ack". Mpirun thinks it sent it, but the app is unable to receive it, so I would guess the app is failing to progress the RTE.
>>>>>
>>>>> If you can't resolve that quickly, I would suggest backing this out and the two of us looking at it in the morning. Somehow, you've lost the progress loop that was driving the RTE thru orte_init.
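
The hang described above (mpirun has sent the "ack", but the app never sees it) is what one would expect if nothing drives the runtime's progress function while the app waits. Below is a toy model of that failure mode only. It is not Open MPI or ORTE code: `ToyRuntime`, `deliver`, `progress`, and `wait_for_ack` are hypothetical names invented for illustration, and the spin limit exists only so the sketch terminates instead of hanging.

```python
from collections import deque


class ToyRuntime:
    """Hypothetical stand-in for a runtime layer; not the ORTE API."""

    def __init__(self):
        self.pending = deque()      # messages delivered but not yet handled
        self.ack_received = False   # set only when progress() handles the ack

    def deliver(self, msg):
        """Simulate the launcher sending a message to this process."""
        self.pending.append(msg)

    def progress(self):
        """Handle at most one pending message; return True if one was handled."""
        if not self.pending:
            return False
        msg = self.pending.popleft()
        if msg == "sync-ack":
            self.ack_received = True
        return True


def wait_for_ack(rt, drive_progress, max_spins=1000):
    """Spin until the ack is seen or max_spins is exhausted.

    With drive_progress=False the delivered ack stays queued forever,
    which models an app that waits without progressing the runtime.
    """
    for _ in range(max_spins):
        if rt.ack_received:
            return True
        if drive_progress:
            rt.progress()
    return rt.ack_received
```

In this model the sender's view ("I sent it") and the receiver's view ("I never got it") are both accurate: the message is queued, but only a call to `progress()` turns it into a visible state change.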
>>>>>
>>>>>
>>>>> On Jun 26, 2012, at 3:44 PM, Josh Hursey wrote:
>>>>>
>>>>>> r26670 is the first of the ORCA commits. I am switching machines for
>>>>>> testing. Hang on for a couple more hours while the initial testing is
>>>>>> underway.
>>>>>>
>>>>>> -- Josh
>>>>>>
>>>>>> On Tue, Jun 26, 2012 at 4:34 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>>>>>> I am requesting a quiet time on the trunk for ORCA integration
>>>>>>> starting -now- (as previously announced). I will post back when
>>>>>>> everything is committed and ready to go.
>>>>>>>
>>>>>>> Some reading while you are waiting:
>>>>>>>  http://www.open-mpi.org/community/lists/devel/2012/06/11109.php
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Josh
>>>>>>>
>>>>>>> --
>>>>>>> Joshua Hursey
>>>>>>> Postdoctoral Research Associate
>>>>>>> Oak Ridge National Laboratory
>>>>>>> http://users.nccs.gov/~jjhursey
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> devel_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey