On Tue, Dec 1, 2009 at 18:03, Ralph Castain
<rhc@open-mpi.org> wrote:
You may want to check your limits as defined by the shell/system. I can also run this for as long as I'm willing to let it run, so something else appears to be going on.
Is that with 1.3.3? I found that with 1.3.4 I can run the example much longer until I hit this error message:
[master] (31996) forking processes
[mujo:14273] opal_os_dirpath_create: Error: Unable to create the sub-directory (/tmp/.private/nbock/openmpi-sessions-nbock@mujo_0/13386/31998) of (/tmp/.private/nbock/openmpi-sessions-nbock@mujo_0/13386/31998/0), mkdir failed [1]
[mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 101
[mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 425
[mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_session_dir failed
--> Returned value Error (-1) instead of ORTE_SUCCESS