Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Quiet Time on Trunk - ORCA Integration
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2012-06-27 12:20:42


This morning Jeff, Brian, Ralph, and I discussed the issues that
emerged with the ORCA branch. We have a clear path forward. It might
take a little while (a few weeks) to finish the effort since I am going
to be transiting over the next month, but we will see once we get into
the code.

We will start working on the fixes immediately. There is a new
BitBucket branch for this effort that you can watch at:
  https://bitbucket.org/jjhursey/ompi-orca

(The branch will be filled in later today)

We'll post back when it is ready for further review.

-- Josh

On Tue, Jun 26, 2012 at 9:29 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
> ORCA was backed out of the trunk in r26676.
>
> Once we fix the linking issue, we will bring this back.
>
> Sorry for the noise folks. The trunk is open again.
>
> -- Josh
>
> On Tue, Jun 26, 2012 at 9:04 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>> So I'm spinning my wheels on this one. I am going to need someone with
>> more knowledge about linking to help (Jeff or Brian maybe?).
>>
>> I'll see what I can do to back this out of the trunk :(
>>
>> I wish this issue had come up earlier - I just don't ever build
>> on my mac, so I never saw it.
>>
>> -- Josh
>>
>> On Tue, Jun 26, 2012 at 8:25 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>> So I can confirm that it is not linking properly on the Mac. It -is-
>>> running correctly on Linux (which is where I have been testing).
>>>
>>> From what I can tell this is a linking issue specific to the Mac. I'm
>>> digging into it a bit at the moment.
>>>
>>> An interesting way to tell is to use ompi_info as a canary. If you
>>> compile with (what is the default):
>>>  ../../../ompi/libmpi.la -L/Users/jju/work/open-mpi/ompi-trunk/orte/ -lopen-rte
>>> it will display components from OMPI, ORCA, OPAL.
>>> If you change that to:
>>>  ../../../ompi/libmpi.la  ../../../orte/libopen-rte.la
>>> you get the same thing, but if you change it to:
>>>  ./../../ompi/libmpi.la  ../../../orte/libopen-rte.la ./../../ompi/libmpi.la
>>> then it only displays the ORTE, OPAL components.
>>>
>>> So I am thinking that this is an issue with using two .la's in a
>>> single link command - something that is not showing up on Linux.
>>>
>>> Any pointers on what might be going on here would be appreciated as I
>>> dig further.
>>>
>>> -- Josh
>>>
>>>
>>> On Tue, Jun 26, 2012 at 7:40 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>>> That is odd. I did not see that when testing on Linux. I'll take a look.
>>>>
>>>> -- josh
>>>>
>>>> On Tue, Jun 26, 2012 at 7:37 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>> FWIW: it built fine on my Mac, but doesn't run. Just hangs when attempting to execute any MPI application. Will execute non-MPI apps though.
>>>>>
>>>>> Looking deeper, it looks like the app is stuck waiting to receive the sync registration "ack". Mpirun thinks it sent it, but the app is unable to receive it, so I would guess the app is failing to progress the RTE.
>>>>>
>>>>> If you can't resolve that quickly, I would suggest backing this out and the two of us looking at it in the morning. Somehow, you've lost the progress loop that was driving the RTE thru orte_init.
>>>>>
>>>>>
>>>>> On Jun 26, 2012, at 3:44 PM, Josh Hursey wrote:
>>>>>
>>>>>> r26670 is the first of the ORCA commits. I am switching machines for
>>>>>> testing. Hang on for a couple more hours while the initial testing is
>>>>>> underway.
>>>>>>
>>>>>> -- Josh
>>>>>>
>>>>>> On Tue, Jun 26, 2012 at 4:34 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>>>>>> I am requesting a quiet time on the trunk for ORCA integration
>>>>>>> starting -now- (as previously announced). I will post back when
>>>>>>> everything is committed and ready to go.
>>>>>>>
>>>>>>> Some reading while you are waiting:
>>>>>>>  http://www.open-mpi.org/community/lists/devel/2012/06/11109.php
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Josh
>>>>>>>
>>>>>>> --
>>>>>>> Joshua Hursey
>>>>>>> Postdoctoral Research Associate
>>>>>>> Oak Ridge National Laboratory
>>>>>>> http://users.nccs.gov/~jjhursey
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> devel_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey