Open MPI Development Mailing List Archives

From: Ralph H. Castain (rhc_at_[hidden])
Date: 2005-07-28 17:41:43

Very interesting! Appreciate the info. My numbers are slightly better
- as I've indicated, there is an NxN message exchange currently in the
system that needs to be removed. With that commented out, the system
scales roughly linearly with number of processes.

At 04:31 PM 7/28/2005, you wrote:
>I have removed the ompi_ignores from the new bproc components I have been
>working on and they are now the default for bproc. These new components
>have several advantages over the old bproc component, mainly:
> - We now provide pty support for standard I/O.
> - It should work better with threaded applications (although this has not
>been tested).
> - We now also support Scyld bproc and old versions of LANL bproc, using a
>serial launch as opposed to the parallel launch used for newer bproc
>versions. (I do not have a box to test this on, so any reports on
>how it works would be appreciated.)
>Their use is the same as before: set your NODES environment variable to a
>comma-delimited list of the nodes to run on.
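
For illustration only, here is a minimal Python sketch of how a comma-delimited NODES value could be split into a node list. It is not the code used by the bproc components; the helper name `parse_nodes` and the fallback value "0,1,2,3" are assumptions made up for this example.

```python
import os


def parse_nodes(value):
    """Split a comma-delimited node list, dropping blanks and stray spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical fallback value, used only when NODES is not set.
nodes = parse_nodes(os.environ.get("NODES", "0,1,2,3"))
print(nodes)
```

The sketch keeps node names as strings, so it makes no assumption about whether nodes are identified by number or by name.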
>The new launcher seems to be pretty scalable. Below are two tables where I
>ran 'hostname' and a trivial MPI program on varying numbers of nodes with
>both 1 and 2 processes per node (all times are in seconds).
>Running hostname:
>Nodes  1 per node  2 per node
>    1        .162        .172
>    2        .202        .224
>    4        .243        .251
>    8        .260        .275
>   16        .305        .321
>   32        .360        .412
>   64        .524        .708
>  128       1.036       1.627
>Running a trivial MPI process (Init/Finalize):
>Nodes  1 per node  2 per node
>    1         .33         .46
>    2         .44         .63
>    4         .56         .77
>    8         .61         .89
>   16         .71         1.1
>   32         .88         1.5
>   64         1.4         3.5
>  128         3.1         9.2
>The frontend and nodes are dual Opteron 242 with 2 GB RAM and GigE.
>I have been told that there are some NxN exchanges going on in the MPI
>processes, which are probably inflating the running times.
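
One rough way to read the two tables is to look at how much each timing grows every time the node count doubles. The Python sketch below does that with the numbers copied from the tables above; the helper name `doubling_ratios` is made up for this example. A ratio that stays near 2 at the larger node counts would correspond to roughly linear growth, and a ratio above 2 to faster-than-linear growth, which is one possible sign of the NxN exchange mentioned here.

```python
# Timings in seconds, copied from the tables: nodes -> (1 per node, 2 per node).
HOSTNAME = {
    1: (0.162, 0.172), 2: (0.202, 0.224), 4: (0.243, 0.251),
    8: (0.260, 0.275), 16: (0.305, 0.321), 32: (0.360, 0.412),
    64: (0.524, 0.708), 128: (1.036, 1.627),
}
MPI = {
    1: (0.33, 0.46), 2: (0.44, 0.63), 4: (0.56, 0.77),
    8: (0.61, 0.89), 16: (0.71, 1.1), 32: (0.88, 1.5),
    64: (1.4, 3.5), 128: (3.1, 9.2),
}


def doubling_ratios(table, column):
    """Return {nodes: time(nodes) / time(nodes / 2)} for one timing column."""
    counts = sorted(table)
    return {
        larger: table[larger][column] / table[smaller][column]
        for smaller, larger in zip(counts, counts[1:])
    }


for label, table in (("hostname", HOSTNAME), ("MPI init/finalize", MPI)):
    for column, per_node in enumerate((1, 2)):
        ratios = doubling_ratios(table, column)
        text = ", ".join(f"{n}: {r:.2f}" for n, r in ratios.items())
        print(f"{label}, {per_node} per node -> {text}")
```

This only summarizes the measurements already reported; it does not separate launch cost from the cost of the MPI processes themselves.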
>The launcher is split into 2 separate components. The general idea is:
> 1. pls_bproc is called by orterun. It figures out the process mapping and
> launches orteds on the nodes.
> 2. pls_bproc_orted is called by orted. This module initializes either a
> pty or pipes, places symlinks to them at well-known points in the
> filesystem, and sets up the I/O forwarding. It then sends an ack back to
> orterun.
> 3. pls_bproc waits for an ack to come back from the orteds, then does
> parallel launches of the application processes. The number of launches is
> equal to the maximum number of processes on a node.
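
The last point in step 3 can be illustrated with a small Python sketch. It is only one reading of the description, not the pls_bproc implementation: it assumes that each parallel launch starts at most one process on every node, so launch k covers the nodes that still have a k-th process to start. The function name `launch_rounds` and the example mapping are hypothetical.

```python
def launch_rounds(procs_per_node):
    """Group application processes into parallel launches.

    procs_per_node maps a node name to the number of processes mapped to it.
    Launch k starts one process on each node that has more than k processes.
    """
    rounds = []
    for k in range(max(procs_per_node.values(), default=0)):
        rounds.append(sorted(n for n, count in procs_per_node.items() if count > k))
    return rounds


# Hypothetical mapping: two processes on nodes "0" and "1", one on node "2".
mapping = {"0": 2, "1": 2, "2": 1}
for index, nodes in enumerate(launch_rounds(mapping)):
    print(f"launch {index}: nodes {nodes}")
```

Under this assumption the number of launches equals the largest per-node process count, regardless of how many nodes are in use.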
>Let me know if there are any problems,