Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Heads up on new feature to 1.3.4
From: Chris Samuel (csamuel_at_[hidden])
Date: 2009-08-18 02:11:45


----- "Eugene Loh" <Eugene.Loh_at_[hidden]> wrote:

> Ah, you're missing the third secret safety switch that prevents
> hapless mortals from using this stuff accidentally! :^)

Sounds good to me. :-)
 
> I think you need to add
>
> --mca opal_paffinity_alone 1

Yup, looks like that's it; with that set, it now fails to launch:

$ mpiexec --mca opal_paffinity_alone 1 -bysocket -bind-to-socket -mca odls_base_report_bindings 99 -mca odls_base_verbose 7 ./cpi-1.4
[tango095.vpac.org:18548] mca:base:select:( odls) Querying component [default]
[tango095.vpac.org:18548] mca:base:select:( odls) Query of component [default] set priority to 1
[tango095.vpac.org:18548] mca:base:select:( odls) Selected component [default]
[tango095.vpac.org:18548] [[33990,0],0] odls:launch: spawning child [[33990,1],0]
[tango095.vpac.org:18548] [[33990,0],0] odls:launch: spawning child [[33990,1],1]
[tango095.vpac.org:18548] [[33990,0],0] odls:default:fork binding child [[33990,1],0] to socket 0 cpus 000f
[tango095.vpac.org:18548] [[33990,0],0] odls:default:fork binding child [[33990,1],1] to socket 1 cpus 00f0
--------------------------------------------------------------------------
An attempt to set processor affinity has failed - please check to
ensure that your system supports such functionality. If so, then
this is probably something that should be reported to the OMPI developers.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec was unable to start the specified application as it encountered an error
on node tango095.vpac.org. More information may be available above.
--------------------------------------------------------------------------
4 total processes failed to start

This is most likely because it's getting an error from the
kernel when trying to bind to a socket it's not permitted
to access.
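
One rough way to check that guess is to compare the CPU masks from the
"binding child" lines above with the set of CPUs the kernel actually
allows the calling process to use. The sketch below is only an
illustration, not anything Open MPI does itself: it assumes the masks
(000f, 00f0) are hexadecimal bitmasks in which bit i stands for CPU i,
and the helper names mask_to_cpus and disallowed_cpus are made up for
this example. os.sched_getaffinity is Linux-only.

```python
import os


def mask_to_cpus(hex_mask):
    """Convert a hex bitmask string such as '00f0' into a set of CPU ids.

    Assumption: bit i of the mask corresponds to CPU i.
    """
    value = int(hex_mask, 16)
    return {bit for bit in range(value.bit_length()) if (value >> bit) & 1}


def disallowed_cpus(hex_mask, allowed):
    """Return the CPUs named in the mask that are not in the allowed set."""
    return mask_to_cpus(hex_mask) - set(allowed)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # CPUs the kernel currently permits this process to run on.
    allowed = os.sched_getaffinity(0)
    print("allowed CPUs:", sorted(allowed))
    # Masks copied from the odls:default:fork lines in the log above.
    for mask in ("000f", "00f0"):
        blocked = disallowed_cpus(mask, allowed)
        if blocked:
            print(mask, "asks for CPUs outside the allowed set:", sorted(blocked))
        else:
            print(mask, "is fully inside the allowed set")
```

Run from inside the same job or cpuset that mpiexec was started in; if
either mask asks for CPUs outside the allowed set, that would be
consistent with the kernel rejecting the bind.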

cheers,
Chris

-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency