Open MPI Development Mailing List Archives

This web mail archive is frozen.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] 1.7.4rc2r30148 run failure NetBSD6-x86
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-01-08 23:50:17


Hmmm...looks to me like the code should protect against this - unless the system isn't correctly reporting binding support. Could you run this with "-mca ess_base_verbose 10"? This will output the topology we found, including the binding support (which isn't in the usual output).

On Jan 8, 2014, at 8:14 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> Hmmm...I see the problem. Looks like binding isn't supported on that system for some reason, so we need to turn "off" our auto-binding when we hit that condition. I'll check to see why that isn't happening (it was supposed to do so).
>
>
> On Jan 8, 2014, at 3:43 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
>> While I have yet to get a working build on NetBSD for x86-64 h/w, I *have* successfully built Open MPI's current 1.7.4rc tarball on NetBSD-6 for x86. However, I can't *run* anything:
>>
>> Attempting the ring_c example on 2 cores:
>> -bash-4.2$ mpirun -mca btl sm,self -np 2 examples/ring_c
>> --------------------------------------------------------------------------
>> While computing bindings, we found no available cpus on
>> the following node:
>>
>> Node: pcp-j-17
>>
>> Please check your allocation.
>> --------------------------------------------------------------------------
>>
>> The failure is the same w/o "-mca btl sm,self"
>> Singleton runs fail just as the np=2 run did.
>>
>> I've attached compressed output from "ompi_info --all".
>>
>> Since this is probably an hwloc-related issue, I also built hwloc-1.7.2 from pristine sources.
>> I have attached compressed output of lstopo which NOTABLY indicates a failure to bind to both of the CPUs.
>>
>> For now, an explicit "--bind-to none" is working for me.
>> Please let me know what additional info may be required.
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove PHHargrove_at_[hidden]
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>> <ompi_info-netbsd-x86.txt.bz2><lstopo172-netbsd-x86.txt.bz2>