Open MPI User's Mailing List Archives

From: Ralph Castain (rhc_at_[hidden])
Date: 2007-01-29 20:47:58


On 1/29/07 6:38 PM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:

> On Jan 19, 2007, at 5:21 PM, Evan Smyth wrote:
>
>> I had been using MPICH and its serv_p4 daemon to speed startup times.
>> I've decided to try OpenMPI (primarily for the fault-tolerance features)
>> and would like to know what the equivalent of the serv_p4 daemon is.
>
> We don't yet have one. "Persistent" daemon operation is planned and
> somewhat functional, but I wouldn't call it robust yet.
>
> Ralph will likely correct some inaccuracies in the above statement. :-)

Ah now, would I do that?? :-)

Actually, I concur with Jeff's assessment. We really don't have that
"virtual machine" functionality yet. I've worked on it a little, but am
probably a few weeks away from completing it. It won't be in the 1.2
release, but (hopefully) will be in an update to that release in the
not-too-distant future.

>
>> It appears as though the orted daemon may be what I am after but I don't
>> quite understand it. I used to run serv_p4 with a specific port number
>> and then pass a -p4ssport <portnumber> flag to mpirun. The daemon would
>> remain running on each node and each new mpirun job would simply
>> communicate directly through a port with the already running instance of
>> the daemon on that machine and would save the mpirun from having to
>> launch an rsh. This was great for reducing startup and run times due to
>> rsh issues. The orted daemon does support a -persistent flag which seems
>> relevant, but I cannot find a real usage example.
>>
>> I expect that most of the readers will find this to be a trivial problem
>> but I'm hoping someone can give me an openmpi equivalent usage example.
>
> We usually rely on resource managers (e.g., slurm and the like) for
> fast startup, which is why persistent daemon-based operation wasn't
> high on the priority list.
>
> LAM, for example, has a persistent daemon mode which works quite
> nicely. But LAM lacks many of the advanced features in OMPI's MPI
> layer.