Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi-1.3a1r18241 ompi-restart issue
From: Sharon Brunett (sharon_at_[hidden])
Date: 2008-04-25 19:12:29


Josh,
I'm responding to some outstanding questions about the environment I'm trying to ompi-restart in.
My answers to your questions are sprinkled below, along with a few more questions based on attempts I've made to get a multi-node restart working.

thanks,
Sharon

Sharon Brunett wrote:
> Josh Hursey wrote:
>> On Apr 23, 2008, at 4:04 PM, Sharon Brunett wrote:
>>
>>> Hello,
>>> I'm using openmpi-1.3a1r18241 on a 2-node configuration and having
>>> trouble with ompi-restart. I can successfully ompi-checkpoint
>>> and ompi-restart a 1-way MPI code.
>>> When I try a 2-way job running across 2 nodes, I get
>>>
>>> bash-2.05b$ ompi-restart -verbose ompi_global_snapshot_926.ckpt
>>> [shc005:01159] Checking for the existence of (/home/sharon/
>>> ompi_global_snapshot_926.ckpt)
>>> [shc005:01159] Restarting from file (ompi_global_snapshot_926.ckpt)
>>> [shc005:01159] Exec in self
>>> Restart failed: Permission denied
>>> Restart failed: Permission denied
>>>
>> This error is coming from BLCR. A few things to check.
>>
>> First take a look at /var/log/messages on the machine(s) you are
>> trying to restart on. Per:
>> http://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#eperm
>>
>> Next check to make sure prelinking is turned off on the two machines
>> you are using. Per:
>> http://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#prelink
>>
>> Those will rule out some common BLCR problems. (more below)
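(To run the two checks above, something like the following should do on a RHEL-style node; the exact paths are an assumption on my part and will vary by distribution.)

-------------------
shell$ sudo tail -n 50 /var/log/messages | grep -i blcr   # look for restart/EPERM messages from BLCR
shell$ grep PRELINKING /etc/sysconfig/prelink              # should report PRELINKING=no
-------------------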
>>
>>> If I try running as root, using the same snapshot file, the code
>>> restarts OK, but both tasks end up on the same node, rather than one
>>> per node (like the original mpirun).
>> You should never have to run as root to restart a process (or to run
>> Open MPI in any form). So I'm wondering if your user has permissions
>> to access the checkpoint files that BLCR is generating. You can look
>> at the permissions for the individual checkpoint files by looking into
>> the checkpoint handler directory. They are a bit hidden, so something
>> like the following should expose them:
>> -------------------
>> shell$ ls -la /home/sharon/ompi_global_snapshot_926.ckpt/0/
>> opal_snapshot_0.ckpt/
>> total 1756
>> drwx------ 2 sharon users 4096 Apr 23 16:29 .
>> drwx------ 4 sharon users 4096 Apr 23 16:29 ..
>> -rw------- 1 sharon users 1780180 Apr 23 16:29 ompi_blcr_context.31849
>> -rw-r--r-- 1 sharon users 35 Apr 23 16:29 snapshot_meta.data
>> shell$
>> shell$ ls -la /home/sharon/ompi_global_snapshot_926.ckpt/0/
>> opal_snapshot_1.ckpt/
>> total 1756
>> drwx------ 2 sharon users 4096 Apr 23 16:29 .
>> drwx------ 4 sharon users 4096 Apr 23 16:29 ..
>> -rw------- 1 sharon users 1780180 Apr 23 16:29 ompi_blcr_context.31850
>> -rw-r--r-- 1 sharon users 35 Apr 23 16:29 snapshot_meta.data
>> -------------------
>>
>> The BLCR-generated context files are "ompi_blcr_context.PID", and you
>> need to check to make sure that you have sufficient permissions to
>> access those files (something like above).
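(A quick way to confirm ownership and permissions across the whole snapshot tree, sketched under the assumption that it lives under my home directory:)

-------------------
shell$ find /home/sharon/ompi_global_snapshot_926.ckpt ! -user sharon -ls   # list anything not owned by my user
shell$ chmod -R u+rwX /home/sharon/ompi_global_snapshot_926.ckpt            # restore owner read/write if needed
-------------------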
>>
>>> I'm using BLCR version 0.6.5.
>>> I generate checkpoints via 'ompi-checkpoint pid'
>>> where pid is the pid of the mpirun task below
>>>
>>> mpirun -np 2 -am ft-enable-cr ./xhpl
>>>
>> Are you running in a managed environment (e.g., using Torque or
>> Slurm)? Odds are that once you switched to root you lost the
>> environment variables for your allocation (which is how Open MPI detects
>> when to use an allocation). This would explain why the processes were
>> restarted on one node instead of two.
>>
Maui/Torque is the scheduler/resource manager combo being used. I have been trying, to no avail, to push a machinefile (listing the hostnames of the nodes given to me by Maui/Torque) to ompi-restart so that it can in turn pass it on to mpirun. Any suggestions on how to do this? --verbose passed to ompi-restart isn't very verbose about what's going on.
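In case it helps to see what I mean, the rough shape of what I've been attempting is below; I'm not sure ompi-restart honors a default hostfile set this way, so the MCA parameter is an assumption on my part.

-------------------
shell$ cat $PBS_NODEFILE > my_machinefile                        # hostnames handed out by Maui/Torque
shell$ export OMPI_MCA_orte_default_hostfile=$PWD/my_machinefile
shell$ ompi-restart -verbose ompi_global_snapshot_926.ckpt
-------------------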

>> ompi-restart uses mpirun underneath to do the process launch in
>> exactly the same way as a normal mpirun, so the mapping of processes
>> should be the same. That being said, there is a bug that I'm tracking
>> in which the mappings do not match. This bug has nothing to do with
>> restarting processes; it is a bookkeeping error when using app files.
>>
>>
Right, I doubt that bug has anything to do with my basic problem of the MPI tasks launching only on the node mpirun is sitting on rather than across the 2 nodes.
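One sanity check I can run on the allocation itself (separate from the restart path) is to compare against what a plain mpirun does inside the same Torque job, e.g.:

-------------------
shell$ cat $PBS_NODEFILE        # should list both nodes in the allocation
shell$ mpirun -np 2 hostname    # should print one hostname per node if the mapping is right
-------------------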

Thanks,
Sharon