Open MPI Development Mailing List Archives

Subject: [OMPI devel] Return code and error message problems
From: Tim Prins (tprins_at_[hidden])
Date: 2008-03-25 08:35:13


Hi,

Something went wrong last night and all our MTT tests had the following
output:
[odin005.cs.indiana.edu:28167] [[46567,0],0] ORTE_ERROR_LOG: Error in file
base/plm_base_launch_support.c at line 161
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it encountered
an error.
More information may be available above.
--------------------------------------------------------------------------

I have not tracked down what caused this, but the more immediate problem
is that after printing this error, mpirun returned 0 instead of a
nonzero exit status.
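
To illustrate why this matters for automated testing, here is a minimal
sketch of a check a harness could run to catch this case. It is not
taken from MTT. The names is_silent_failure and run_and_check are
hypothetical, and the banner text is simply copied from the output
above:

```python
import subprocess

# Banner text copied from the mpirun output quoted above; treating it as
# a reliable failure marker is an assumption made for this sketch.
ERROR_BANNER = "mpirun was unable to start the specified application"


def is_silent_failure(returncode: int, output: str) -> bool:
    """Return True when the failure banner was printed but the exit status is 0."""
    return returncode == 0 and ERROR_BANNER in output


def run_and_check(cmd: list[str]) -> bool:
    """Run cmd, merge stderr into stdout, and report a silent failure."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return is_silent_failure(proc.returncode, proc.stdout)
```

A harness that looks only at the exit status would record the run above
as a pass, while a check like this one would flag it.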

Also, when running the test 'orte/test/mpi/abort' I get the error output:
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 17822 on
node odin013 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

This message is wrong: it should say that the process aborted. It looks
like the job state is somehow being set to
ORTE_JOB_STATE_ABORTED_WO_SYNC instead of ORTE_JOB_STATE_ABORTED.

Thanks,

Tim