Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2006-07-05 09:48:06


> -----Original Message-----
> From: users-bounces_at_[hidden]
> [mailto:users-bounces_at_[hidden]] On Behalf Of Jack Howarth
> Sent: Monday, July 03, 2006 10:35 AM
> To: users_at_[hidden]
> Subject: [OMPI users] open-mpi on MacOS X
>
> I have created simple fink (http://fink.sourceforge.net) packaging
> for open-mpi v1.1 on MacOS X. The packaging builds open-mpi with its
> default settings in configure and appears to pass all of its
> make check without problems.

Thanks!

> However, the lack of clear documentation for open-mpi
> still is a problem.

Agreed. This is something that we're actively working on.

In the meantime, feel free to send your questions to this list.

> I seem able to manually run the test programs from
> the open-mpi distribution using...
>
> mdrun -np 2 ...

Just to clarify -- what is mdrun? Do you mean mpirun? Open MPI does
not provide an executable named "mdrun".
 
> after starting the orted daemon with....
>
> orted --seed --persistent --scope public

Per Brock's comments, you don't need to start the orted manually.
Indeed, this model is only loosely tested -- it has known problems with
not releasing all resources at the end of each mpirun (e.g., the memory
footprint of that orted will keep growing over time). See below.

> I can see both cpus spike when I do the mdrun's so I think
> that works. However, I can't figure out the proper way to
> monitor the status of the available nodes. Specifically,
> what is the equivalent to the lamnodes program in open-mpi?

Right now, Open MPI does not have a "persistent universe" model like
LAM's (e.g., lamboot over a bunch of nodes). Instead, an orted is
launched behind the scenes on each node for each job (e.g., in the
rsh/ssh case, we rsh/ssh to each node once, launch an orted, and then
the orted launches as many user processes as necessary).

However, just like LAM, Open MPI can use the back-end
scheduler/resource manager to know which nodes to launch on. Even with
lamboot, you had to specify a hostfile or have a back-end resource
manager that said "use these nodes." lamnodes was not really a
monitoring tool -- it was more of a "here are the nodes that you
specified to me earlier" tool.

If you really want monitoring tools for your nodes, you might want to
look outside of MPI -- SLURM and Torque are fairly common open source
resource managers. And there's a bunch of tools available for
monitoring nodes in a cluster, too.

> Also, is there a simple test program that runs for a significant
> period of time that I can use to test the different options to
> monitor and control the open-mpi jobs that are running under
> orted? Thanks in advance for any clarifications.

Open MPI's run-time options are [essentially] read at startup and used
for the duration of the job's run. Most of the options are not
changeable after a given run has started.

We have not yet included any sample apps inside Open MPI (a la LAM), but
we'll likely include some simple "hello world" and other well-known
sample MPI apps in the future. For long-running tests, you might want
to run any of the MPI benchmark suites available (e.g., NetPIPE, the
Intel benchmarks, HPL, etc.).

> Jack
> ps I assume that at v1.1, open-mpi is considered to be a usable
> replacement for lam? Certainly, gromacs 3.3.1 seems to compile
> its mpi support against open-mpi.

Yes. There are still some features in LAM that are not yet in Open MPI
(e.g., a persistent universe), but most of the good/important ones are
being added to Open MPI over time.

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems