Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Doing a lot of spawns does not work with ompi 1.3 BUT works with ompi 1.2.7
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-01-27 10:20:12


Just to be clear - you are doing over 1000 MPI_Comm_spawn calls to
launch all the procs on a single node???
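
(For reference, a spawn loop of the kind being described would look
roughly like the sketch below. This is only an illustration, not the
attached master.c; it assumes one slave is spawned per iteration over
MPI_COMM_SELF.)

  /* sketch-master.c (illustrative only): spawn ./slave repeatedly */
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int i, nspawns = 1000;
      MPI_Init(&argc, &argv);
      for (i = 0; i < nspawns; i++) {
          MPI_Comm child;
          if (MPI_Comm_spawn("./slave", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                             0, MPI_COMM_SELF, &child,
                             MPI_ERRCODES_IGNORE) != MPI_SUCCESS) {
              fprintf(stderr, "spawn %d failed\n", i);
              break;
          }
          /* ...exchange data with the child here... */
          MPI_Comm_disconnect(&child);   /* drop the intercommunicator */
      }
      MPI_Finalize();
      return 0;
  }

Each such call asks the runtime to fork/exec another ./slave, which is
where the per-process pipe usage comes from, as explained below.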

In the 1.2 series, every call to MPI_Comm_spawn would launch another
daemon on the node, which would then fork/exec the specified app. If
you look at your process table, you will see a whole lot of "orted"
processes. Thus, you wouldn't run out of pipes because every orted
only opened enough for a single process.

In the 1.3 series, there is only one daemon on each node (mpirun fills
that function on its node). MPI_Comm_spawn simply reuses that daemon
to launch the new proc(s). Thus, there is a limit to the number of
procs you can start on any node that is set by the #pipes a process
can open.

You can adjust that number, of course. You can look it up readily
enough for your particular system. However, you may find that 1000
comm_spawns on a single node will lead to poor performance as the
procs contend for processor attention.
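
(On Linux, for example, "ulimit -n" in the shell shows the per-process
descriptor limit before you invoke mpirun. A program can also query or
raise it itself via getrlimit/setrlimit; a minimal sketch, not specific
to Open MPI:)

  /* check-nofile.c (illustrative): report and optionally raise the
   * per-process open-descriptor limit, which caps pipes as well */
  #include <stdio.h>
  #include <sys/resource.h>

  int main(void)
  {
      struct rlimit rl;

      if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
          perror("getrlimit");
          return 1;
      }
      printf("soft limit: %lu   hard limit: %lu\n",
             (unsigned long) rl.rlim_cur, (unsigned long) rl.rlim_max);

      rl.rlim_cur = rl.rlim_max;          /* raise soft up to hard limit */
      if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
          perror("setrlimit");

      return 0;
  }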

Hope that helps
Ralph

On Jan 27, 2009, at 7:59 AM, Anthony Thevenin wrote:

> Hello,
>
> I have two C codes:
> - master.c: spawns a slave
> - slave.c: spawned by the master
>
> If the spawn is included in a do-loop, I can do only 123 spawns
> before getting the following errors:
>
> ORTE_ERROR_LOG: The system limit on number of pipes a process can
> open was reached in file base/iof_base_setup.c at line 112
> ORTE_ERROR_LOG: The system limit on number of pipes a process can
> open was reached in file odls_default_module.c at line 203
>
> This test works perfectly even for a lot of spawns (more than 1000)
> with Open-MPI 1.2.7.
>
> You will find the following files attached:
> config.log.tgz
> ompi_info.out.tgz
> ifconfig.out.tgz
> master.c.tgz
> slave.c.tgz
>
>
> Command used to run my application:
> mpirun -n 1 ./master
>
> COMPILER:
> PGI 7.1
>
> PATH:
> /space/thevenin/openmpi-1.3_pgi/bin:/usr/local/tecplot/bin:/usr/local/pgi/linux86-64/7.1/bin:/usr/totalview/bin:/usr/local/matlab71/bin:/usr/bin:/usr/ucb:/usr/sbin:/usr/bsd:/sbin:/bin:/usr/bin/X11:/usr/etc:/usr/local/bin:/usr/bin:/usr/bsd:/sbin:/usr/bin/X11:.
>
> LD_LIBRARY_PATH:
> /space/thevenin/openmpi-1.3_pgi/lib:/usr/local/lib
>
>
> If you have any idea why this occurs, please tell me what to do
> to make it work.
> Thank you very much
>
>
> Anthony
>
>
>
> <config.log.tgz> <ifconfig.out.tgz> <master.c.tgz> <ompi_info.out.tgz> <slave.c.tgz>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users