Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Mac OSX 10.6 (SL) + openMPI 1.3.3 + Intel Compilers 11.1.076
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-11-05 08:26:48


I'm afraid that Open MPI v1.3.x's xgrid support is currently broken --
we haven't had anyone with the knowledge or experience available to
fix it. :-( Patches would be welcome...

Note that Open MPI itself works fine on Snow Leopard -- it's just the
xgrid launching support that is broken.
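
One possible workaround, offered as an untested sketch rather than a
confirmed fix for this machine: tell Open MPI to skip the xgrid launcher
component so that it selects a different one. MCA parameters can be set
through OMPI_MCA_-prefixed environment variables, and a leading ^ excludes
the named component. The application name and process count below are
placeholders.

```shell
# Assumption: with the xgrid launcher excluded, Open MPI picks another
# available launcher (for example rsh/ssh) on its own.
export OMPI_MCA_plm='^xgrid'

# Equivalent per-invocation form (./my_mpi_app is a hypothetical program):
#   mpirun --mca plm ^xgrid -np 4 ./my_mpi_app
```

The environment variable should also apply when the executable is started
directly rather than through mpirun.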

On Nov 5, 2009, at 7:18 AM, Christophe Peyret wrote:

> Hello,
>
> I'm trying to launch a job with mpirun on my Mac Pro and I get a
> strange error message.
> Any idea?
>
> Christophe
>
>
> [santafe.onera:00235] orte:plm:xgrid: Connection to XGrid controller
> unexpectedly closed: (600) The operation couldn’t be completed.
> (BEEP error 600.)
> 2009-11-05 13:13:53.973 orted[235:903] *** Terminating app due to
> uncaught exception 'NSInvalidArgumentException', reason: '*** -
> [XGConnection<0x100224df0> finalize]: called when collecting not
> enabled'
> *** Call stack at first throw:
> (
> 0 CoreFoundation 0x00007fff8712c5a4
> __exceptionPreprocess + 180
> 1 libobjc.A.dylib 0x00007fff87b8d313
> objc_exception_throw + 45
> 2 CoreFoundation 0x00007fff87147251 -
> [NSObject(NSObject) finalize] + 129
> 3 mca_plm_xgrid.so 0x0000000100149720 -
> [PlmXGridClient dealloc] + 64
> 4 mca_plm_xgrid.so 0x00000001001480e0
> orte_plm_xgrid_finalize + 64
> 5 mca_plm_xgrid.so 0x0000000100147fa1
> orte_plm_xgrid_component_query + 529
> 6 libopen-pal.0.dylib 0x00000001000811ea
> mca_base_select + 186
> )
> terminate called after throwing an instance of 'NSException'
> [santafe:00235] *** Process received signal ***
> [santafe:00235] Signal: Abort trap (6)
> [santafe:00235] Signal code: (0)
> [santafe:00235] *** End of error message ***
> [santafe.onera:00233] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to
> start a daemon on the local node in file ess_singleton_module.c at
> line 381
> [santafe.onera:00233] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to
> start a daemon on the local node in file ess_singleton_module.c at
> line 143
> [santafe.onera:00233] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to
> start a daemon on the local node in file runtime/orte_init.c at line
> 132
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel
> process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_ess_set_name failed
> --> Returned value Unable to start a daemon on the local node
> (-128) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process
> is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or
> environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_mpi_init: orte_init failed
> --> Returned "Unable to start a daemon on the local node" (-128)
> instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [santafe.onera:233] Abort before MPI_INIT completed successfully;
> not able to guarantee that all other processes were killed!
> santafe:Example peyret$

-- 
Jeff Squyres
jsquyres_at_[hidden]