Open MPI Development Mailing List Archives

From: Renato Golin (rengolin_at_[hidden])
Date: 2006-09-07 12:46:40


Hi Devel list, crossposting as this is getting weird...

Alfonso wrote a client/server pair using MPI_Publish_name / MPI_Lookup_name
and it runs fine on both MPICH2 and LAM-MPI but fails on Open MPI. It's
not a simple failure (i.e. returning an error code); it aborts
execution and quits. The server continues to run after the
client's crash.

The server also uses 100% of the CPU while running, which doesn't happen with LAM.

The code is here:
http://www.systemcall.com.br/rengolin/open-mpi/

Open MPI version: 1.1.1

Compiling:
mpiCC -o server server.c
mpiCC -o client client.c
 - or -
mpiCC -o client client.c -DUSE_LOOKUP

Running & Output:
-- Server --
sbornia$ mpiexec server foo
server Process Rank 0 ,TOT processes 1 on sbornia
Server foo available at 0.1.0:2000

-- Client without USE_LOOKUP --
sbornia$ mpiexec client foo
Rank Client Process 0 ,TOT processes 1 on sbornia
[sbornia:06246] [0,1,0] ORTE_ERROR_LOG: Pack data mismatch in file dss/dss_unpack.c at line 171
[sbornia:06246] [0,1,0] ORTE_ERROR_LOG: Pack data mismatch in file dss/dss_unpack.c at line 145
[sbornia:06246] *** An error occurred in MPI_Comm_connect
[sbornia:06246] *** on communicator MPI_COMM_WORLD
[sbornia:06246] *** MPI_ERR_UNKNOWN: unknown error
[sbornia:06246] *** MPI_ERRORS_ARE_FATAL (goodbye)
[sbornia:06243] [0,0,0]-[0,1,0] mca_oob_tcp_msg_recv: readv failed with errno=104

-- Client with USE_LOOKUP --
sbornia$ mpiexec client foo
Rank Client Process 0 ,TOT processes 1 on sbornia
[sbornia:06232] *** An error occurred in MPI_Lookup_name
[sbornia:06232] *** on communicator MPI_COMM_WORLD
[sbornia:06232] *** MPI_ERR_NAME: invalid name argument
[sbornia:06232] *** MPI_ERRORS_ARE_FATAL (goodbye)
[sbornia:06229] [0,0,0]-[0,1,0] mca_oob_tcp_msg_recv: readv failed with errno=104

OS error code 104: Connection reset by peer
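
The errno lookup above can be reproduced with the standard C library. The following is a small sketch; describe_errno and is_connection_reset are hypothetical helper names, and the numeric value of ECONNRESET is platform-specific (it is 104 on common Linux targets, which matches the log, but differs on other systems), so comparing against the symbolic constant is safer than hard-coding the number.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: map an errno value to the C library's message text. */
const char *describe_errno(int code)
{
    return strerror(code);
}

/* Hypothetical helper: true when the code is the "connection reset" error. */
int is_connection_reset(int code)
{
    return code == ECONNRESET;
}
```

Calling describe_errno(104) on the machine that produced the log should return the same text quoted above.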

What are we doing wrong?

Thanks in advance!
--renato