Open MPI User's Mailing List Archives

Subject: [OMPI users] OpenMPI job launch failures
From: Bharath Ramesh (bramesh_at_[hidden])
Date: 2013-02-14 10:21:35

On our cluster we are seeing intermittent job launch failures when
using OpenMPI. We are currently running OpenMPI-1.6.1, which is
integrated with Torque-4.1.3. The failures occur even for a simple MPI
hello world application. orted gets launched on all the nodes, but a
number of nodes never launch the actual MPI application. No errors are
reported when the job is eventually killed because its walltime
expires. Enabling --debug-daemons doesn't show any errors either. The
only difference is that successful runs have MPI_proctable listed,
while for failed runs it is absent. Any help in debugging this issue
is greatly appreciated.
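
Since the presence of MPI_proctable is the only visible difference between good and bad runs, a small script can sort captured job output into the two groups. This is a minimal sketch, not part of our actual setup: it assumes each job's debug output has been saved as text, and it only does a plain substring match on "MPI_proctable" because the exact format of that line is not reproduced here. The function names `has_proctable` and `classify_runs` are hypothetical helpers made up for illustration.

```python
from typing import Dict, Iterable

# Marker string that appears in the output of successful runs (per the
# description above). The full line format is unknown, so we only do a
# substring match.
MARKER = "MPI_proctable"


def has_proctable(log_lines: Iterable[str], marker: str = MARKER) -> bool:
    """Return True if any line of the captured output mentions the marker."""
    return any(marker in line for line in log_lines)


def classify_runs(logs: Dict[str, str]) -> Dict[str, bool]:
    """Map each job name to True (marker present) or False (marker absent).

    `logs` maps a job name (hypothetical, e.g. a Torque job id) to the
    full text of that job's captured output.
    """
    return {job: has_proctable(text.splitlines()) for job, text in logs.items()}
```

Jobs that map to False are the ones worth comparing against a successful run, for example by checking which nodes they were scheduled on.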