Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-10-08 03:32:11


In last night's MTT run, I got a bunch of errors in COMM_SPAWN. I know
we're expecting it to fail (possibly/probably due to IOF errors), but
these failures don't look like the ones we expected. For simplicity, I
compiled the IBM test suite manually and ran the spawn test:

[0:30] svbu-mpi:~/svn/ompi-tests/ibm/dynamic % mpirun -np 3 spawn
[svbu-mpi001.cisco.com:02845] [1,1] ORTE_ERROR_LOG: Communication failure in file grpcomm_basic_module.c at line 666
[svbu-mpi001.cisco.com:02845] [1,1] ORTE_ERROR_LOG: Communication failure in file communicator/comm_dyn.c at line 274
[**ERROR**]: MPI_COMM_WORLD rank 1, file spawn.c:114:
ERROR: MPI_Comm_spawn returned errcode[0] = -112
[svbu-mpi001.cisco.com:02845] MPI_ABORT invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 1
[svbu-mpi001.cisco.com:02846] [1,2] ORTE_ERROR_LOG: Communication failure in file grpcomm_basic_module.c at line 666
[svbu-mpi001.cisco.com:02846] [1,2] ORTE_ERROR_LOG: Communication failure in file communicator/comm_dyn.c at line 274
[**ERROR**]: MPI_COMM_WORLD rank 2, file spawn.c:114:
ERROR: MPI_Comm_spawn returned errcode[0] = -112
[svbu-mpi001.cisco.com:02846] MPI_ABORT invoked on rank 2 in communicator MPI_COMM_WORLD with errorcode 1

This looks odd to me ("communication failure"). Ralph -- can you
investigate?

Thanks!

-- 
Jeff Squyres
Cisco Systems