Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Comm_spawn lots of times
From: Nicolas Bock (nicolasbock_at_[hidden])
Date: 2009-12-01 22:58:16


On Tue, Dec 1, 2009 at 18:03, Ralph Castain <rhc_at_[hidden]> wrote:

> You may want to check your limits as defined by the shell/system. I can
> also run this for as long as I'm willing to let it run, so something else
> appears to be going on.
>
>
>
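[Editorial note: the limits check Ralph suggests can be done directly from the shell. A minimal sketch; which limit actually matters depends on the OS, and the /proc path is Linux-only:]

```shell
# Per-process limits that a MPI_Comm_spawn loop commonly exhausts.
ulimit -n   # max open file descriptors (pipes count against this)
ulimit -u   # max processes per user
# System-wide open-file limit (Linux only; absent elsewhere)
cat /proc/sys/fs/file-max 2>/dev/null
```

If `ulimit -n` is low (e.g. 256 or 1024), repeated spawns can plausibly run out of pipes well before 200 iterations.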
Is that with 1.3.3? I found that with 1.3.4 I can run the example much
longer before hitting this error message:

[master] (31996) forking processes
[mujo:14273] opal_os_dirpath_create: Error: Unable to create the
sub-directory (/tmp/.private/nbock/openmpi-sessions-nbock_at_mujo_0/13386/31998)
of (/tmp/.private/nbock/openmpi-sessions-nbock_at_mujo_0/13386/31998/0), mkdir
failed [1]
[mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file
util/session_dir.c at line 101
[mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file
util/session_dir.c at line 425
[mujo:14273] [[13386,31998],0] ORTE_ERROR_LOG: Error in file
base/ess_base_std_app.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_session_dir failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
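[Editorial note: the mkdir failure above suggests per-spawn session directories under /tmp are accumulating or hitting a limit. A rough way to watch them; the path patterns are taken from the log above and may differ on your system (e.g. with a different TMPDIR):]

```shell
# Count Open MPI session directories left under /tmp.
# Each spawned job creates one; they should disappear when the job exits.
# If the count grows with every iteration, cleanup is not happening.
ls -d /tmp/.private/*/openmpi-sessions-* /tmp/openmpi-sessions-* 2>/dev/null | wc -l
```

If stale directories are the cause, Open MPI's `orte-clean` utility should remove leftover session directories for you.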

> On Dec 1, 2009, at 4:38 PM, Nicolas Bock wrote:
>
>
>
> On Tue, Dec 1, 2009 at 16:28, Abhishek Kulkarni <abbyzcool_at_[hidden]> wrote:
>
>> On Tue, Dec 1, 2009 at 6:15 PM, Nicolas Bock <nicolasbock_at_[hidden]>
>> wrote:
>> > After reading Anthony's question again, I am not sure now that we are
>> > having the same problem, but we might. In any case, the attached example
>> > programs trigger the issue of running out of pipes. I don't see how orted
>> > could run out of pipes, even if it were reused. Only a very limited number
>> > of processes is running at any given time. Once a slave terminates, how
>> > would it still have open pipes? Shouldn't the total number of open files,
>> > or pipes, be very limited in this situation? And yet, after maybe 20 or so
>> > iterations in master.c, orted complains about running out of pipes.
>> >
>> > nick
>> >
>>
>> What version of OMPI are you trying it with? I can easily run it up to
>> more than 200 iterations.
>>
>>
> openmpi-1.3.3
>
>
>
>> Also, how many nodes are you running this on?
>>
> This is on one node with 4 cores. I am using
>
> mpirun -np 1
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users