Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.7.4rc1 run failure on Solaris 10 / SPARC (not SIGBUS)
From: Paul Hargrove (phhargrove_at_[hidden])
Date: 2013-12-19 22:39:57


Ralph,

I can confirm "--bind-to none" worked to eliminate the error, but the test
now appears to hang :-(
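
A minimal sketch of the workaround invocation, assuming the same arguments as in the original report quoted below with only "--bind-to none" added. The MPIRUN and CMD variables are hypothetical helpers for illustration, not part of Open MPI; the full install path from the report can be substituted via MPIRUN.

```shell
# Hypothetical helper: point MPIRUN at the mpirun of your install (defaults to the one in PATH).
MPIRUN="${MPIRUN:-mpirun}"

# Same arguments as in the original report, plus "--bind-to none" to skip CPU binding.
CMD="$MPIRUN --bind-to none -mca btl sm,self -np 2 examples/ring_c"

# Show the command; run it only if both mpirun and the example binary are present.
echo "$CMD"
if command -v "$MPIRUN" >/dev/null 2>&1 && [ -x examples/ring_c ]; then
    $CMD
fi
```

This only sidesteps the binding error; it does not address the hang seen afterwards.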

Since you say the binding problem is probably fixed for rc2, I'll see if the
latest nightly tarball works better by default.

-Paul

On Thu, Dec 19, 2013 at 7:19 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> I believe this one has already been fixed and is in the nightly (1.7.4rc2)
> - for now, you can just set "--bind-to none" on the cmd line to get past it
>
>
> On Dec 19, 2013, at 6:42 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
> Testing with Solaris 10 on SPARC, I was expecting to encounter the bus
> error reported previously by Siegmar Gross. Instead I see the following
> hwloc-related abort:
>
> $ env \
>     PATH=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin:$PATH \
>     LD_LIBRARY_PATH_64=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/lib:$LD_LIBRARY_PATH_64 \
>     OMPI_MCA_shmem_mmap_enable_nfs_warning=0 \
>     /home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin/mpirun \
>     -mca btl sm,self -np 2 examples/ring_c
> --------------------------------------------------------------------------
> Open MPI tried to bind a new process, but something went wrong. The
> process was killed without launching the target application. Your job
> will now abort.
>
> Local host: niagara1
> Application name: examples/ring_c
> Error message: hwloc indicates cpu binding cannot be enforced
> Location: /home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc1/orte/mca/odls/default/odls_default_module.c:478
> --------------------------------------------------------------------------
> 2 total processes failed to start
>
>
> I am assuming I just need some magic pixie dust to disable cpu binding.
> I'd appreciate some corresponding instructions.
>
> However, if this is NOT an expected/desired/known behavior please let me
> know what I can/should do to help determine the root cause.
>
>
> -Paul
>
> --
> Paul H. Hargrove PHHargrove_at_[hidden]
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900