Open MPI User's Mailing List Archives

From: George Bosilca (bosilca_at_[hidden])
Subject: Re: [OMPI users] openmpi fails on mx endpoint busy
Date: 2007-07-11 11:27:47


What's in the hostmx10g file? How many hosts?

   george.

On Jul 11, 2007, at 1:34 AM, Warner Yuen wrote:

> I've also had someone run into the endpoint-busy problem. I never
> figured it out; I just increased the default number of endpoints on
> MX-10G from 8 to 16 to make the problem go away. Here's the actual
> command and error before setting the endpoints to 16. The version is
> MX-1.2.1 with OMPI 1.2.3:
>
> node1:~/taepic tae$ mpirun --hostfile hostmx10g -byslot -mca btl self,sm,mx -np 12 test_beam_injection test_beam_injection.inp -npx 12 > out12
> [node2:00834] mca_btl_mx_init: mx_open_endpoint() failed with status=20
> --------------------------------------------------------------------------
> Process 0.1.3 is unable to reach 0.1.7 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> Process 0.1.11 is unable to reach 0.1.7 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> --------------------------------------------------------------------------
>
>
> Warner Yuen
> Scientific Computing Consultant
> Apple Computer
> email: wyuen_at_[hidden]
> Tel: 408.718.2859
> Fax: 408.715.0133
>
>
> On Jul 10, 2007, at 7:53 AM, users-request_at_[hidden] wrote:
>
>> ------------------------------
>>
>> Message: 2
>> Date: Tue, 10 Jul 2007 09:19:42 -0400
>> From: Tim Prins <tprins_at_[hidden]>
>> Subject: Re: [OMPI users] openmpi fails on mx endpoint busy
>> To: Open MPI Users <users_at_[hidden]>
>> Message-ID: <4693876E.4070302_at_[hidden]>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> SLIM H.A. wrote:
>>> Dear Tim
>>>
>>>> So, you should just be able to run:
>>>> mpirun --mca btl mx,sm,self -mca mtl ^mx -np 4 -hostfile ompi_machinefile ./cpi
>>>
>>> I tried
>>>
>>> node001>mpirun --mca btl mx,sm,self -mca mtl ^mx -np 4 -hostfile ompi_machinefile ./cpi
>>>
>>> I put in a sleep call to keep it running for some time and to monitor
>>> the endpoints. None of the 4 were open; it must have used tcp.
>> No, this is not possible. With this command line it will not use tcp.
>> Are you launching on more than one machine? If the procs are all on one
>> machine, then it will use the shared memory component to communicate
>> (sm), although the endpoints should still be opened.
>>
>> Just to make sure, you did put the sleep between MPI_Init and
>> MPI_Finalize?
>>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
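
For reference, a minimal test program of the kind Tim describes above might look like the sketch below: a sleep placed between MPI_Init and MPI_Finalize keeps each rank alive long enough to inspect the MX endpoints on the nodes while the job is running. The file name and the 60-second duration are only illustrative assumptions, not something taken from the thread.

/* sleep_test.c -- minimal sketch (not from the thread) of a test program
 * with a sleep between MPI_Init and MPI_Finalize, so each rank stays
 * alive and the open MX endpoints can be inspected on the nodes.
 * The 60-second duration is an arbitrary, illustrative choice. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("rank %d of %d is up; sleeping so endpoints can be checked\n",
           rank, size);
    fflush(stdout);

    sleep(60);   /* window during which the endpoints stay open */

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with the same mpirun options being tested (for example, mpirun --mca btl mx,sm,self -mca mtl ^mx -np 4 -hostfile ompi_machinefile ./sleep_test), each rank idles for a minute, which is enough time to check how many endpoints are actually open on each node.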