Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi-1.3a1r18241 ompi-restart issue
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2008-04-28 20:24:28

On Apr 25, 2008, at 6:12 PM, Sharon Brunett wrote:

> Josh,
> I'm responding to some outstanding questions about the env. I'm
> trying to ompi-restart in.
> My answers to your questions are sprinkled below, and include a few
> more questions based on attempts I've made to get a multi-node
> restart working.
> thanks,
> Sharon
> Sharon Brunett wrote:
>> Josh Hursey wrote:
>>> On Apr 23, 2008, at 4:04 PM, Sharon Brunett wrote:
>>>> Hello,
>>>> I'm using openmpi-1.3a1r18241 on a 2 node configuration and having
>>>> troubles with the ompi-restart. I can successfully ompi-checkpoint
>>>> and ompi-restart a 1 way mpi code.
>>>> When I try a 2 way job running across 2 nodes, I get
>>>> bash-2.05b$ ompi-restart -verbose ompi_global_snapshot_926.ckpt
>>>> [shc005:01159] Checking for the existence of (/home/sharon/ompi_global_snapshot_926.ckpt)
>>>> [shc005:01159] Restarting from file (ompi_global_snapshot_926.ckpt)
>>>> [shc005:01159] Exec in self
>>>> Restart failed: Permission denied
>>>> Restart failed: Permission denied
>>> This error is coming from BLCR. A few things to check.
>>> First take a look at /var/log/messages on the machine(s) you are
>>> trying to restart on. Per:
>>> Next check to make sure prelinking is turned off on the two machines
>>> you are using. Per:
>>> Those will rule out some common BLCR problems. (more below)
>>>> If I try running as root, using the same snapshot file, the code
>>>> restarts ok, but both tasks end up on the same node, rather than
>>>> one per node (like the original mpirun).
>>> You should never have to run as root to restart a process (or to run
>>> Open MPI in any form). So I'm wondering if your user has permissions
>>> to access the checkpoint files that BLCR is generating. You can look
>>> at the permissions for the individual checkpoint files by looking into
>>> the checkpoint handler directory. They are a bit hidden, so something
>>> like the following should expose them:
>>> -------------------
>>> shell$ ls -la /home/sharon/ompi_global_snapshot_926.ckpt/0/opal_snapshot_0.ckpt/
>>> total 1756
>>> drwx------ 2 sharon users 4096 Apr 23 16:29 .
>>> drwx------ 4 sharon users 4096 Apr 23 16:29 ..
>>> -rw------- 1 sharon users 1780180 Apr 23 16:29 ompi_blcr_context.31849
>>> -rw-r--r-- 1 sharon users 35 Apr 23 16:29
>>> shell$
>>> shell$ ls -la /home/sharon/ompi_global_snapshot_926.ckpt/0/opal_snapshot_1.ckpt/
>>> total 1756
>>> drwx------ 2 sharon users 4096 Apr 23 16:29 .
>>> drwx------ 4 sharon users 4096 Apr 23 16:29 ..
>>> -rw------- 1 sharon users 1780180 Apr 23 16:29 ompi_blcr_context.31850
>>> -rw-r--r-- 1 sharon users 35 Apr 23 16:29
>>> -------------------
>>> The BLCR-generated context files are "ompi_blcr_context.PID", and you
>>> need to check that you have sufficient permissions to access those
>>> files (something like above).
>>>> I'm using BLCR version 0.6.5.
>>>> I generate checkpoints via 'ompi-checkpoint pid'
>>>> where pid is the pid of the mpirun task below
>>>> mpirun -np 2 -am ft-enable-cr ./xhpl
>>> Are you running in a managed environment (e.g., using Torque or
>>> Slurm)? Odds are that once you switched to root you lost the
>>> environment variables for your allocation (which is how Open MPI
>>> detects when to use an allocation). This would explain why the
>>> processes were restarted on one node instead of two.
> Maui/torque is the scheduler/resource manager combo being used. I
> have been trying, to no avail, to push a machinefile (listing the
> hostnames of the nodes given to me by maui/torque) at ompi-restart
> which can in turn pass this on to mpirun. Any suggestions on how to
> do this? --verbose passed to ompi-restart isn't very verbose about
> what's going on.

If you pass '--help' to ompi-restart it will show you all the command
line options for that command (following UNIX convention). To pass a
hostfile to ompi-restart, use either the --hostfile or the
--machinefile option, the same way you would with orterun. ompi-restart
will pass this on to the orterun it starts up.
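
As a rough sketch of the above under Torque: Torque normally exports the
path of the allocated node list in $PBS_NODEFILE, so inside the job that
file can be handed straight to ompi-restart. The helper name
build_restart_cmd is purely illustrative (it is not part of Open MPI), and
the script only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: build an ompi-restart command line that forwards a hostfile.
# Assumptions: PBS_NODEFILE is set by Torque inside the job, and
# build_restart_cmd is a hypothetical helper, not an Open MPI tool.

build_restart_cmd() {
    snapshot="$1"                       # global snapshot reference
    hostfile="${2:-${PBS_NODEFILE:-}}"  # explicit hostfile, else Torque's

    if [ -z "$hostfile" ]; then
        # No allocation visible (e.g. after switching users): no hostfile.
        echo "ompi-restart $snapshot"
    else
        echo "ompi-restart --hostfile $hostfile $snapshot"
    fi
}

# Print the command for inspection; run it by hand once it looks right.
build_restart_cmd ompi_global_snapshot_926.ckpt
```

If the printed command has no --hostfile argument, the allocation's
environment is not visible in the current shell, which would match the
"everything lands on one node" symptom.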

There is one bug I'm trying to track at the moment with app context
files. In the current trunk processes are not being mapped quite as
consistently as they should be. You may be running into this problem,
but I can't say for sure at the moment.

>>> ompi-restart uses mpirun underneath to do the process launch in
>>> exactly the same way as a normal mpirun. So the mapping of processes
>>> should be the same. That being said, there is a bug that I'm tracking
>>> in which they are not. This bug has nothing to do with restarting
>>> processes, and more to do with a bookkeeping error when using app files.
> Right, I doubt the bug has anything to do with my basic problem of
> the mpi tasks not launching across 2 nodes, but only on the node
> mpirun is sitting on.

Did you check the permissions of the resulting checkpoint files to
make sure that you have the proper access to them?
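
A quick way to run that check, as a minimal sketch: walk the snapshot
directory and report any BLCR context file the current user cannot read.
The helper name check_ckpt_perms is hypothetical, and the file name pattern
ompi_blcr_context.PID is taken from the listing quoted earlier in this
thread. Run it as the normal user, since root can read everything.

```shell
#!/bin/sh
# Sketch: report BLCR context files the current user cannot read.
# Assumptions: context files are named ompi_blcr_context.PID, and
# check_ckpt_perms is a hypothetical helper, not an Open MPI tool.

check_ckpt_perms() {
    snapshot_dir="$1"
    if [ ! -d "$snapshot_dir" ]; then
        echo "no such directory: $snapshot_dir"
        return 2
    fi

    unreadable=0
    # Paths are assumed to contain no whitespace (true for the names above).
    # find's own "Permission denied" messages are left visible on purpose.
    for f in $(find "$snapshot_dir" -type f -name 'ompi_blcr_context.*'); do
        if [ ! -r "$f" ]; then
            echo "not readable: $f"
            unreadable=$((unreadable + 1))
        fi
    done

    echo "unreadable context files: $unreadable"
    [ "$unreadable" -eq 0 ]
}

# Check the snapshot directory given as the first argument, if any.
if [ -n "${1:-}" ]; then
    check_ckpt_perms "$1"
fi
```

For the snapshot in this thread, the argument would be
/home/sharon/ompi_global_snapshot_926.ckpt, checked on each node involved
in the restart.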

So I am a little confused: are you now able to restart properly,
outside of the hostfile issue described above?

-- Josh

> Thanks,
> Sharon