Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: [OMPI devel] Return code and error message problems
From: Tim Prins (tprins_at_[hidden])
Date: 2008-03-25 08:35:13


Hi,

Something went wrong last night and all our MTT tests had the following
output:
[odin005.cs.indiana.edu:28167] [[46567,0],0] ORTE_ERROR_LOG: Error in file
base/plm_base_launch_support.c at line 161
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it encountered
an error.
More information may be available above.
--------------------------------------------------------------------------

I have not tracked down the cause yet, but the more immediate problem
is that after printing this error, mpirun returned 0 instead of a
nonzero error code.
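
To illustrate why the zero exit status matters, here is a minimal sketch
of a harness that trusts only the exit code. It is not MTT's actual
logic; run_and_classify and the two stand-in commands are hypothetical,
and a small Python one-liner stands in for mpirun.

```python
import subprocess
import sys


def run_and_classify(cmd):
    """Run cmd and classify it the way a simple test harness might."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    # A harness that trusts only the exit status reports "pass" on 0,
    # regardless of any error text the launcher printed.
    status = "pass" if result.returncode == 0 else "fail"
    return status, result.returncode


# Stand-in for a launcher that prints an error but still exits 0.
buggy = [sys.executable, "-c",
         "import sys; print('unable to start', file=sys.stderr); sys.exit(0)"]
# Stand-in for the expected behaviour: same message, nonzero exit status.
fixed = [sys.executable, "-c",
         "import sys; print('unable to start', file=sys.stderr); sys.exit(1)"]

print(run_and_classify(buggy))
print(run_and_classify(fixed))
```

With the first stand-in, the failed launch is recorded as a pass, which
is how a failure like this can go unnoticed in automated results.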

Also, when running the test 'orte/test/mpi/abort' I get the error output:
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 17822 on
node odin013 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

This is wrong: it should say that the process was aborted. It looks
like the job state is somehow being set to
ORTE_JOB_STATE_ABORTED_WO_SYNC instead of ORTE_JOB_STATE_ABORTED.

Thanks,

Tim