I've also had someone run into the endpoint-busy problem. I never figured out the root cause; I just increased the default number of endpoints on our MX-10G cards from 8 to 16, which made the problem go away. Here's the actual command and the error from before I raised the endpoint count to 16. The versions are MX 1.2.1 with Open MPI 1.2.3:

node1:~/taepic tae$ mpirun --hostfile hostmx10g -byslot -mca btl self,sm,mx -np 12 test_beam_injection test_beam_injection.inp -npx 12 > out12
[node2:00834] mca_btl_mx_init: mx_open_endpoint() failed with status=20
--------------------------------------------------------------------------
Process 0.1.3 is unable to reach 0.1.7 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.11 is unable to reach 0.1.7 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

 PML add procs failed
 --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
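For anyone wanting to try the same workaround: the endpoint limit is set when the MX kernel driver loads, so it has to be changed at driver load time on every node. The sketch below is how that generally looks; the parameter name `mx_max_endpoints` is an assumption on my part, not verified against the MX-10G manual, so check your MX driver documentation for the exact option.

```shell
# Sketch of raising the MX endpoint limit (assumed parameter name --
# "mx_max_endpoints" may differ by MX release; consult your MX docs).
mx_info | grep -i endpoint          # inspect the current endpoint count
sudo rmmod mx_driver                # unload the MX kernel driver
sudo modprobe mx_driver mx_max_endpoints=16   # reload with a higher limit
```

This has to be repeated (or scripted into the driver init) on each compute node, since every node's driver enforces its own limit.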


Warner Yuen
Scientific Computing Consultant
Apple Computer
email: wyuen@apple.com
Tel: 408.718.2859
Fax: 408.715.0133



On Jul 10, 2007, at 7:53 AM, users-request@open-mpi.org wrote:

Message: 2
Date: Tue, 10 Jul 2007 09:19:42 -0400
From: Tim Prins <tprins@open-mpi.org>
Subject: Re: [OMPI users] openmpi fails on mx endpoint busy
To: Open MPI Users <users@open-mpi.org>

SLIM H.A. wrote:
> Dear Tim
>
>> So, you should just be able to run:
>>
>> mpirun --mca btl mx,sm,self -mca mtl ^mx -np 4 -hostfile ompi_machinefile ./cpi
>
> I tried
>
> node001>mpirun --mca btl mx,sm,self -mca mtl ^mx -np 4 -hostfile ompi_machinefile ./cpi
>
> I put in a sleep call to keep it running for some time and to monitor
> the endpoints. None of the 4 were open, it must have used tcp.

No, this is not possible. With this command line it will not use tcp.
Are you launching on more than one machine? If the procs are all on one
machine, then it will use the shared memory component to communicate
(sm), although the endpoints should still be opened.

Just to make sure, you did put the sleep between MPI_Init and MPI_Finalize?
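For clarity, a minimal version of the test being suggested looks like the sketch below: the sleep must sit between MPI_Init and MPI_Finalize, because the MX endpoints are only guaranteed to be open inside that window. The 60-second duration and the printf are my additions for illustration, not from the original thread.

```c
/* Minimal sketch: keep the job alive between MPI_Init and MPI_Finalize
 * so the MX endpoints stay open long enough to inspect on each node
 * (e.g. with mx_info). Build with mpicc and launch with mpirun. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d up; sleeping so endpoints can be inspected\n", rank);
    fflush(stdout);
    sleep(60);        /* endpoints should be visible during this window */
    MPI_Finalize();   /* endpoints are closed at or after finalize */
    return 0;
}
```

A sleep placed after MPI_Finalize (or before MPI_Init) would show nothing, since the endpoints would already be closed (or not yet opened).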