Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ompi-checkpoint hangs when using in multiple clusters
From: Fernando Lemos (fernandotcl_at_[hidden])
Date: 2010-03-23 21:58:55


On Tue, Mar 23, 2010 at 1:25 PM, fengguang tian <fernyabc_at_[hidden]> wrote:
> now, I set $HOME as shared directory, but when doing ompi-checkpoint, it
> shows:(nimbus1 is the remote machine in
> my cluster)
>
> [nimbus1:12630] opal_os_dirpath_create: Error: Unable to create the
> sub-directory (/home/mpiu/ompi_global_snapshot_1662.ckpt/0) of
> (/home/mpiu/ompi_global_snapshot_1662.ckpt/0/opal_snapshot_4.ckpt), mkdir
> failed [1]
> [nimbus1:12630] Error: No metadata filename specified!
>
> why is that?

The error is described in the error message...

[nimbus1:12630] opal_os_dirpath_create: Error: Unable to create the
sub-directory (/home/mpiu/ompi_global_snapshot_1662.ckpt/0) of
(/home/mpiu/ompi_global_snapshot_1662.ckpt/0/opal_snapshot_4.ckpt),
mkdir failed [1]

If the number in brackets is an errno value, then 1 is EPERM, "Operation
not permitted". Most likely the user running mpirun doesn't have the
necessary privileges to write to the shared file system (e.g., the
file system is mounted read-only, or the user lacks write access to
the directory).
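
To check this on the affected node, something like the following Python
sketch may help. It is not from the original thread, and the helper
names (describe_errno, can_create_subdir) are made up for illustration.
It decodes an errno value and then tries to create a scratch directory
under the checkpoint parent, run as the same user that runs mpirun. It
assumes the bracketed number really is an errno and that the snapshot
directory lives under $HOME, as in the log above.

```python
import errno
import os
import tempfile


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def can_create_subdir(parent):
    """Try to create, then remove, a scratch directory under `parent`.

    Returns (True, None) on success, or (False, errno name) on failure.
    """
    try:
        scratch = tempfile.mkdtemp(prefix="ckpt_probe_", dir=parent)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, "UNKNOWN")
    os.rmdir(scratch)
    return True, None


if __name__ == "__main__":
    # Assumption: the "[1]" in the log is an errno value.
    print(describe_errno(1))
    # Assumption: checkpoints are written under the user's home directory.
    print(can_create_subdir(os.path.expanduser("~")))
```

Running it on nimbus1 as the mpirun user should show whether that user
can create directories on the shared file system at all, and if not,
which error the kernel reports.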

Also, please make sure you don't post the same issue twice to the mailing list.