Open MPI User's Mailing List Archives

Subject: [OMPI users] Issue with : mca_oob_tcp_peer_recv_connect_ack on SGI Altix
From: Gilbert Grosdidier (Gilbert.Grosdidier_at_[hidden])
Date: 2010-12-15 03:05:41


Hello,

  Running Open MPI 1.4.3 on an SGI Altix cluster with 4096 cores, I got
this error message right at startup:
mca_oob_tcp_peer_recv_connect_ack: received unexpected process identifier [[13816,0],209]

  After that, the whole job spins indefinitely, without crashing or
aborting.

  What could be the culprit?
Is there a workaround?
Which parameter should be tuned?

  Thanks in advance for any help. Best, G.