Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application
From: Kalin Kanov (kalin_at_[hidden])
Date: 2010-11-29 18:32:07


Hi Shiqing,

I must have missed your response among all the e-mails that get sent to
the mailing list. Here are a few more details about the issues that I
am having. My client/server programs seem to run some of the time, but
after a successful run I always get the error that I included in
my first post. I run the programs by starting the server
application first, which generates the port string. I then
run the client application with a new call to mpirun. After getting
the errors that I e-mailed about, I also tried to run ompi-clean, but the
results are the following:

>ompi-clean
[Lazar:05984] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ..\..\orte\runtime\orte_init.c at line 125
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

   orte_ess_base_select failed
   --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------

Any help with this issue will be greatly appreciated.

Thank you,
Kalin

On 27.10.2010 05:52, Shiqing Fan wrote:
> Hi Kalin,
>
> Sorry for the late reply.
>
> I checked the code and got confused. (I'm not an MPI expert.) I'm just
> wondering how to start the server and client in the same mpirun command
> when the client needs a hand-input port name, which is given by the
> server at runtime.
>
> I found a similar program on the Internet (see attached) that works
> well on my Windows machine. In this program, the generated port name is
> sent among the processes by MPI_Send.
>
>
> Regards,
> Shiqing
>
>
> On 2010-10-13 11:09 PM, Kalin Kanov wrote:
>> Hi there,
>>
>> I am trying to create a client/server application with OpenMPI, which
>> has been installed on a Windows machine by following the instructions
>> (with CMake) in the README.WINDOWS file in the OpenMPI distribution
>> (version 1.4.2). I have run other test applications that compile fine
>> under the Visual Studio 2008 Command Prompt. However, I get the
>> following errors on the server side when accepting a new client that
>> is trying to connect:
>>
>> [Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\orte\mca\grpcomm\base\grpcomm_base_allgather.c at line 222
>> [Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\orte\mca\grpcomm\basic\grpcomm_basic_module.c at line 530
>> [Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\ompi\mca\dpm\orte\dpm_orte.c at line 363
>> [Lazar:2716] *** An error occurred in MPI_Comm_accept
>> [Lazar:2716] *** on communicator MPI_COMM_WORLD
>> [Lazar:2716] *** MPI_ERR_INTERN: internal error
>> [Lazar:2716] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>> --------------------------------------------------------------------------
>>
>> mpirun has exited due to process rank 0 with PID 476 on
>> node Lazar exiting without calling "finalize". This may
>> have caused other processes in the application to be
>> terminated by signals sent by mpirun (as reported here).
>> --------------------------------------------------------------------------
>>
>>
>> The server and client code is attached. I have struggled with this
>> problem for quite a while, so please let me know what the issue might
>> be. I have looked at the archives and the FAQ, and the only similar
>> report that I have found had to do with different versions of OpenMPI
>> being installed, but I only have one version, and I believe it is the
>> one being used.
>>
>> Thank you,
>> Kalin
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> --------------------------------------------------------------
> Shiqing Fan                      http://www.hlrs.de/people/fan
> High Performance Computing       Tel.: +49 711 685 87234
> Center Stuttgart (HLRS)          Fax.: +49 711 685 65832
> Address: Allmandring 30          email: fan_at_[hidden]
> 70569 Stuttgart
>
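
A note on the port-name handoff discussed in this thread: when the server and the client are started by two separate mpirun calls, the port string generated by the server has to reach the client somehow. One option, instead of typing it in by hand, is to pass it through a file. The sketch below is only an illustration under stated assumptions and is not taken from the attached programs: the helper names write_port_name and read_port_name, the PORT_NAME_MAX constant, and the file name used in the usage note are all hypothetical. The MPI calls themselves are named only in comments, so the helpers compile without an MPI installation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer size for this sketch; real MPI code would size the
 * buffer with MPI_MAX_PORT_NAME from <mpi.h>. */
#define PORT_NAME_MAX 1024

/* Server side: store the port string (as filled in by MPI_Open_port)
 * in a file that the client can read. Returns 0 on success, -1 on error. */
int write_port_name(const char *path, const char *port_name)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = (fputs(port_name, f) >= 0);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Client side: read the port string back so that it can be passed to
 * MPI_Comm_connect. Returns 0 on success, -1 on error. */
int read_port_name(const char *path, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *res = fgets(buf, (int)buflen, f);
    fclose(f);
    if (res == NULL)
        return -1;
    /* Strip a trailing newline or carriage return, if present. */
    buf[strcspn(buf, "\r\n")] = '\0';
    return 0;
}
```

In such a setup the server would call write_port_name("port.txt", port_name) after MPI_Open_port and before MPI_Comm_accept, and the client would call read_port_name("port.txt", buf, sizeof buf) before MPI_Comm_connect. This only addresses how the port string is transferred; it is not a diagnosis of the ORTE_ERROR_LOG messages reported above.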