Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.7.4rc2r30168 - odd run failure
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-01-10 13:08:26


??? That was all of the output? Was this built with --enable-debug?

On Jan 10, 2014, at 10:03 AM, Paul Hargrove <phhargrove_at_[hidden]> wrote:

>
>
>
> On Fri, Jan 10, 2014 at 7:12 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> Very strange. Try adding "-mca grpcomm_base_verbose 5 -mca orte_nidmap_verbose 10" to your command line with the trunk version, and let's see what may be happening.
>
> Most of my systems don't have new enough autotools to work from svn.
> If it is critical, I could set up an rsync from one of my systems that *can* autogen.
>
> So, this is from last night's trunk tarball (1.9a1r30215):
>
> $ mpirun -mca grpcomm_base_verbose 5 -mca orte_nidmap_verbose 10 -np 1 examples/ring_c 2>&1 | tee log
> [cvrsvc01:29185] mca:base:select:(grpcomm) Querying component [bad]
> [cvrsvc01:29185] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
> [cvrsvc01:29185] mca:base:select:(grpcomm) Selected component [bad]
> [cvrsvc01:29188] mca:base:select:(grpcomm) Querying component [bad]
> [cvrsvc01:29188] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
> [cvrsvc01:29188] mca:base:select:(grpcomm) Selected component [bad]
> [cvrsvc01:29188] [[37720,1],0] ORTE_ERROR_LOG: Data for specified key not found in file /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-trunk-linux-x86_64/openmpi-1.9a1r30215/orte/runtime/orte_globals.c at line 503
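
For repeated runs, the two verbosity settings on the mpirun command line above can be kept in one place and turned into either `-mca` arguments or environment variables. The following is only a sketch: the helper names `mca_args` and `mca_env` are hypothetical, and it assumes Open MPI's usual convention that an MCA parameter can also be set through an environment variable named `OMPI_MCA_<param>`.

```python
import os


def mca_args(params):
    """Return the '-mca <name> <value>' argument list for the given parameters."""
    args = []
    for name, value in params.items():
        args += ["-mca", name, str(value)]
    return args


def mca_env(params, base_env=None):
    """Return a copy of the environment with OMPI_MCA_<name> set for each parameter.

    Assumption: the parameters are picked up from the environment in the same
    way as from the command line.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name, value in params.items():
        env["OMPI_MCA_" + name] = str(value)
    return env


# The two settings requested in this thread.
VERBOSE = {"grpcomm_base_verbose": 5, "orte_nidmap_verbose": 10}

# Rebuild the command line used for the run above.
cmd = ["mpirun"] + mca_args(VERBOSE) + ["-np", "1", "examples/ring_c"]
print(" ".join(cmd))
```

Passing `env=mca_env(VERBOSE)` to `subprocess.run` together with a plain `["mpirun", "-np", "1", "examples/ring_c"]` would be the environment-variable form of the same request.
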
>
>
>
> Any chance of library confusion here?
>
> I just verified using /proc/<pid>/maps on the hung orterun and ring_c processes that the only shared libs mapped in are the system ones in /lib64 and the ones from the fresh install of Open MPI. No stale libs from old OMPI builds.
>
> -Paul
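
The /proc/<pid>/maps check described above can be scripted along the following lines. This is a minimal sketch, not the procedure Paul actually used: the function names are hypothetical, the allowed prefixes are supplied by the caller, and it assumes the usual Linux maps layout of six whitespace-separated fields with the pathname last.

```python
import os
import sys


def mapped_libraries(maps_text):
    """Return the sorted shared-object paths named in /proc/<pid>/maps text."""
    libs = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (pathname optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if not path.startswith("/"):
            continue  # pseudo-entries such as [heap], [stack], [vdso]
        name = os.path.basename(path)
        if name.endswith(".so") or ".so." in name:
            libs.add(path)
    return sorted(libs)


def unexpected_libraries(libs, allowed_prefixes):
    """Return the libraries that do not live under any allowed directory prefix.

    Prefixes should end in '/' so that '/lib64/' does not match '/lib64old/'.
    """
    allowed = tuple(allowed_prefixes)
    return [path for path in libs if not path.startswith(allowed)]


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Usage: script.py <pid> <allowed-prefix> [<allowed-prefix> ...]
    with open("/proc/%s/maps" % sys.argv[1]) as maps_file:
        found = mapped_libraries(maps_file.read())
    for path in unexpected_libraries(found, sys.argv[2:]):
        print(path)
```

Run against the pid of the hung orterun or ring_c with `/lib64/` and the install prefix of the fresh build as the allowed prefixes, an empty result would match the "no stale libs" observation.
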
>
>
>
> On Jan 9, 2014, at 9:57 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
>> The problem is seen with both the trunk and the 1.7.4rc tarball.
>>
>> -Paul
>>
>>
>> On Thu, Jan 9, 2014 at 9:23 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>>
>> On Thu, Jan 9, 2014 at 8:56 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>> I'll try a gcc-based build on one of the systems ASAP.
>>
>> Sorry, Ralph: the failure remains when built w/ gcc.
>> Let me know what to try next and I'll give it a shot.
>>
>> -Paul
>>
>>
>> --
>> Paul H. Hargrove PHHargrove_at_[hidden]
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel