Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Quiet Time on Trunk - ORCA Integration
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-06-26 21:56:53


Sorry Josh - BTW, I was unable to build on odin (Linux, slurm) due to some errors in one of the OMPI components. They should be easy to fix, though.

I'll get in touch on Wed morning to help fix this. I apologize - I should have tested your branch in advance. :-(

On Jun 26, 2012, at 7:29 PM, Josh Hursey wrote:

> ORCA was backed out of the trunk in r26676.
>
> Once we fix the linking issue, we will bring this back.
>
> Sorry for the noise folks. The trunk is open again.
>
> -- Josh
>
> On Tue, Jun 26, 2012 at 9:04 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>> So I'm spinning my wheels on this one. I am going to need someone with
>> more knowledge about linking to help (Jeff or Brian maybe?).
>>
>> I'll see what I can do to back this out of the trunk :(
>>
>> I wish this issue had come up earlier - I just don't ever build
>> on my mac, so I never saw it.
>>
>> -- Josh
>>
>> On Tue, Jun 26, 2012 at 8:25 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>> So I can confirm that it is not linking properly on the Mac. It -is-
>>> running correctly on Linux (which is where I have been testing).
>>>
>>> From what I can tell this is a linking issue specific to the Mac. I'm
>>> digging into it a bit at the moment.
>>>
>>> An interesting way to tell is to use ompi_info as a canary. If you
>>> compile with (which is the default):
>>> ../../../ompi/libmpi.la -L/Users/jju/work/open-mpi/ompi-trunk/orte/ -lopen-rte
>>> it will display components from OMPI, ORCA, OPAL.
>>> If you change that to:
>>> ../../../ompi/libmpi.la ../../../orte/libopen-rte.la
>>> you get the same thing, but if you change it to:
>>> ./../../ompi/libmpi.la ../../../orte/libopen-rte.la ./../../ompi/libmpi.la
>>> then it only displays the ORTE, OPAL components.
>>>
>>> So I am thinking that this is an issue with listing two .la files on a
>>> single link line - something that is not showing up on Linux.
>>>
>>> Any pointers on what might be going on here would be appreciated as I
>>> dig further.
>>>
>>> -- Josh
>>>
>>>
>>> On Tue, Jun 26, 2012 at 7:40 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>>> That is odd. I did not see that when testing on Linux. I'll take a look.
>>>>
>>>> -- josh
>>>>
>>>> On Tue, Jun 26, 2012 at 7:37 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>> FWIW: it built fine on my Mac, but doesn't run. Just hangs when attempting to execute any MPI application. Will execute non-MPI apps though.
>>>>>
>>>>> Looking deeper, it looks like the app is stuck waiting to receive the sync registration "ack". Mpirun thinks it sent it, but the app is unable to receive it, so I would guess the app is failing to progress the RTE.
>>>>>
>>>>> If you can't resolve that quickly, I would suggest backing this out and the two of us looking at it in the morning. Somehow, you've lost the progress loop that was driving the RTE thru orte_init.
>>>>>
>>>>>
>>>>> On Jun 26, 2012, at 3:44 PM, Josh Hursey wrote:
>>>>>
>>>>>> r26670 is the first of the ORCA commits. I am switching machines for
>>>>>> testing. Hang on for a couple more hours while the initial testing is
>>>>>> underway.
>>>>>>
>>>>>> -- Josh
>>>>>>
>>>>>> On Tue, Jun 26, 2012 at 4:34 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
>>>>>>> I am requesting a quiet time on the trunk for ORCA integration
>>>>>>> starting -now- (as previously announced). I will post back when
>>>>>>> everything is committed and ready to go.
>>>>>>>
>>>>>>> Some reading while you are waiting:
>>>>>>> http://www.open-mpi.org/community/lists/devel/2012/06/11109.php
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Josh
>>>>>>>
>>>>>>> --
>>>>>>> Joshua Hursey
>>>>>>> Postdoctoral Research Associate
>>>>>>> Oak Ridge National Laboratory
>>>>>>> http://users.nccs.gov/~jjhursey
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> devel_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
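
The ompi_info canary check described in the quoted thread (build with one link line, build with another, see which components disappear) can be sketched as a small comparison script. This is only an illustration, not anything from the ORCA branch: the function names are hypothetical, the sample listings are made up, and the "MCA <framework>: <component> (...)" line format is an assumption about how ompi_info prints its component list.

```python
import re

# Assumed line shape: "    MCA btl: self (MCA v2.0, API v2.0, ...)".
# Only the framework and component names are extracted.
_MCA_LINE = re.compile(r"^\s*MCA\s+(\S+):\s+(\S+)")


def components(ompi_info_text):
    """Return the set of (framework, component) pairs found in the text."""
    found = set()
    for line in ompi_info_text.splitlines():
        match = _MCA_LINE.match(line)
        if match:
            found.add((match.group(1), match.group(2)))
    return found


def missing_components(good_text, suspect_text):
    """Pairs listed by the known-good build but absent from the suspect one."""
    return sorted(components(good_text) - components(suspect_text))


if __name__ == "__main__":
    # Made-up sample listings; real input would be the captured output of
    # ompi_info from each of the two builds being compared.
    good = (
        "    MCA btl: self (illustrative)\n"
        "    MCA pml: ob1 (illustrative)\n"
        "    MCA ess: env (illustrative)\n"
    )
    suspect = "    MCA ess: env (illustrative)\n"
    for framework, component in missing_components(good, suspect):
        print(f"missing: {framework}/{component}")
```

Saving the ompi_info output from each link-line variant to a file and feeding both files through `missing_components` would show at a glance which layer's components dropped out of the build.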