Open MPI Development Mailing List Archives

Subject: [OMPI devel] 1.7.4rc: yet another launch failure
From: Paul Hargrove (phhargrove_at_[hidden])
Date: 2014-01-22 22:02:09

On yet another test platform I see the following:

$ mpirun -mca btl sm,self -np 1 examples/ring_c
Open MPI was unable to obtain the username in order to create a path
for its required temporary directories. This type of error is usually
caused by a transient failure of network-based authentication services
(e.g., LDAP or NIS failure due to network congestion), but can also be
an indication of system misconfiguration.

Please consult your system administrator about these issues and try
[] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in
at line 380
[] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in
at line 599
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_session_dir failed
  --> Returned value Out of resource (-2) instead of ORTE_SUCCESS

An "-np 2" run fails in the same manner.
This is a production system, and neither "whoami" nor "id" shows any problem,
which leaves me doubting the explanation provided by the error message.

[phh1_at_biou2 ~]$ whoami
[phh1_at_biou2 ~]$ id
uid=44154(phh1) gid=2016(hpc)

The "ompi_info --all" output is attached.
Please let me know what additional info is needed.


Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900