Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application
From: Shiqing Fan (fan_at_[hidden])
Date: 2010-12-01 08:42:08


Hi Kalin,

Which version of Open MPI did you use? It seems that the ess component
couldn't be selected. Could you please send me the output of ompi_info?

Regards,
Shiqing

On 2010-11-30 12:32 AM, Kalin Kanov wrote:
> Hi Shiqing,
>
> I must have missed your response among all the e-mails that get sent
> to the mailing list. Here are a few more details about the issues
> that I am having. My client/server programs seem to run sometimes, but
> then after a successful run I always seem to get the error that I
> included in my first post. The way that I run the programs is by
> running the server application first, which generates the port string,
> etc. I then proceed to run the client application with a new call to
> mpirun. After getting the errors that I e-mailed about I also tried to
> run ompi-clean, but the results are the following:
>
> >ompi-clean
> [Lazar:05984] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
> ..\..\orte\runtime\orte_init.c at line 125
> --------------------------------------------------------------------------
>
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_ess_base_select failed
> --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
>
>
> Any help with this issue will be greatly appreciated.
>
> Thank you,
> Kalin
>
>
> On 27.10.2010 05:52, Shiqing Fan wrote:
>> Hi Kalin,
>>
>> Sorry for the late reply.
>>
>> I checked the code and got confused. (I'm not an MPI expert.) I'm just
>> wondering how to start the server and client in the same mpirun command
>> while the client needs a hand-input port name, which is given by the
>> server at runtime.
>>
>> I found a similar program on the Internet (see attached) that works
>> well on my Windows machine. In this program, the generated port name
>> is sent between the processes with MPI_Send.
>>
>>
>> Regards,
>> Shiqing
>>
>>
>> On 2010-10-13 11:09 PM, Kalin Kanov wrote:
>>> Hi there,
>>>
>>> I am trying to create a client/server application with OpenMPI, which
>>> has been installed on a Windows machine by following the instructions
>>> (with CMake) in the README.WINDOWS file in the OpenMPI distribution
>>> (version 1.4.2). I have run other test applications that compile fine
>>> under the Visual Studio 2008 Command Prompt. However, I get the
>>> following errors on the server side when accepting a new client that
>>> is trying to connect:
>>>
>>> [Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file
>>> ..\..\orte\mca\grpcomm\base\grpcomm_base_allgather.c at line 222
>>> [Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file
>>> ..\..\orte\mca\grpcomm\basic\grpcomm_basic_module.c at line 530
>>> [Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file
>>> ..\..\ompi\mca\dpm\orte\dpm_orte.c at line 363
>>> [Lazar:2716] *** An error occurred in MPI_Comm_accept
>>> [Lazar:2716] *** on communicator MPI_COMM_WORLD
>>> [Lazar:2716] *** MPI_ERR_INTERN: internal error
>>> [Lazar:2716] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>> --------------------------------------------------------------------------
>>>
>>>
>>> mpirun has exited due to process rank 0 with PID 476 on
>>> node Lazar exiting without calling "finalize". This may
>>> have caused other processes in the application to be
>>> terminated by signals sent by mpirun (as reported here).
>>> --------------------------------------------------------------------------
>>>
>>>
>>>
>>> The server and client code is attached. I have struggled with this
>>> problem for quite a while, so please let me know what the issue might
>>> be. I have looked at the archives and the FAQ, and the only thing
>>> similar that I have found had to do with different versions of OpenMPI
>>> installed, but I only have one version, and I believe it is the one
>>> being used.
>>>
>>> Thank you,
>>> Kalin
>>>
>>>
>>
>>
>> --
>> --------------------------------------------------------------
>> Shiqing Fan                          http://www.hlrs.de/people/fan
>> High Performance Computing           Tel.: +49 711 685 87234
>>    Center Stuttgart (HLRS)            Fax.: +49 711 685 65832
>> Address:Allmandring 30               email: fan_at_[hidden]
>> 70569 Stuttgart
>>
>

-- 
--------------------------------------------------------------
Shiqing Fan                          http://www.hlrs.de/people/fan
High Performance Computing           Tel.: +49 711 685 87234
   Center Stuttgart (HLRS)            Fax.: +49 711 685 65832
Address:Allmandring 30               email: fan_at_[hidden]
70569 Stuttgart
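
Editor's sketch: the workflow described in the thread (start the server, let it generate a port string, then start the client with a separate mpirun and give it that string) can be illustrated roughly as below. This is not the attached code and is not a fix for the ORTE errors quoted above. It only shows one way to avoid typing the port string by hand, by passing it through a file. The file name port.txt, the helper names save_port_name and load_port_name, the PORT_BUF_LEN constant, and the USE_MPI build switch are all assumptions made for this illustration. Whether two separately launched jobs can actually connect depends on the Open MPI version and runtime setup.

```c
#include <stdio.h>
#include <string.h>

/* Assumed buffer size for the non-MPI helpers; MPI code should size
   its buffer with MPI_MAX_PORT_NAME instead. */
#define PORT_BUF_LEN 1024

/* Write the port string to `path` so that a separately launched
   client can pick it up. Returns 0 on success, -1 on failure. */
int save_port_name(const char *path, const char *port)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = (fputs(port, f) >= 0);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the port string from `path` into `buf` (at most len-1 chars)
   and strip any trailing newline. Returns 0 on success, -1 on failure. */
int load_port_name(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *got = fgets(buf, (int)len, f);
    fclose(f);
    if (got == NULL)
        return -1;
    buf[strcspn(buf, "\r\n")] = '\0';
    return 0;
}

#ifdef USE_MPI  /* hypothetical switch: build with an MPI compiler and -DUSE_MPI */
#include <mpi.h>

/* Run as "<prog> server" first, then "<prog> client" from a second mpirun. */
int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME] = "";
    MPI_Comm peer;
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        if (rank == 0) {
            /* Only the root opens the port and publishes it via the file. */
            MPI_Open_port(MPI_INFO_NULL, port);
            if (save_port_name("port.txt", port) != 0)
                MPI_Abort(MPI_COMM_WORLD, 1);
        }
        /* The port argument is only significant at the root (rank 0). */
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &peer);
        if (rank == 0)
            MPI_Close_port(port);
    } else {
        if (rank == 0 && load_port_name("port.txt", port, sizeof port) != 0)
            MPI_Abort(MPI_COMM_WORLD, 1);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &peer);
    }

    MPI_Comm_disconnect(&peer);
    MPI_Finalize();
    return 0;
}
#endif
```

Without -DUSE_MPI the block compiles as two plain C helpers, which makes the file handoff easy to check on its own before involving mpirun. A shared file is only one option; the single-mpirun variant mentioned earlier in the thread passes the port name with MPI_Send instead.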