Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Comm_spawn lots of times
From: Nicolas Bock (nicolasbock_at_[hidden])
Date: 2009-12-02 12:24:54


On Tue, Dec 1, 2009 at 20:58, Nicolas Bock <nicolasbock_at_[hidden]> wrote:

>
>
> On Tue, Dec 1, 2009 at 18:03, Ralph Castain <rhc_at_[hidden]> wrote:
>
>> You may want to check your limits as defined by the shell/system. I can
>> also run this for as long as I'm willing to let it run, so something else
>> appears to be going on.
>>
>>
>>
> Is that with 1.3.3? I found that with 1.3.4 I can run the example much
> longer until I hit this error message:
>
>
> [master] (31996) forking processes
> [mujo:14273] opal_os_dirpath_create: Error: Unable to create the
> sub-directory (/tmp/.private/nbock/openmpi-sessions-nbock_at_mujo_0/13386/31998)
> of (/tmp/.private/nbock/openmpi-sessions-nbock_at_mujo_0/13386/31998/0),
> mkdir failed [1]
> [mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file
> util/session_dir.c at line 101
> [mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file
> util/session_dir.c at line 425
> [mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file
> base/ess_base_std_app.c at line 132
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_session_dir failed
> --> Returned value Error (-1) instead of ORTE_SUCCESS
>
>
After some googling I found that this is apparently an ext3 filesystem
limitation: a single directory can hold at most 31998 subdirectories. Why is
Open MPI creating all of these directories in the first place? Is there a
way to "recycle" them?
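
To check how close a session directory is to that limit, something like the following could be used. It is only a rough sketch: the 31998 figure is the ext3 limit quoted above, and `count_subdirs` and `headroom` are names made up for illustration, not anything Open MPI provides.

```python
import os

# Per-directory subdirectory limit on ext3, as quoted in this post.
EXT3_MAX_SUBDIRS = 31998


def count_subdirs(path):
    """Count the immediate subdirectories of path (symlinks are not followed)."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def headroom(path, limit=EXT3_MAX_SUBDIRS):
    """Return how many more subdirectories fit in path before limit is reached."""
    return limit - count_subdirs(path)
```

Calling `headroom()` on the job's session directory (the parent of the directory that mkdir failed on) while the spawn loop runs should show the number shrinking toward zero if the directories are never cleaned up.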

nick