Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure
From: Eloi Gaudry (eg_at_[hidden])
Date: 2009-11-10 13:17:54


Reuti,

The ACLs here were just added when I tried to force the /opt/sge/tmp
subdirectories to be 777 (which I did when I first encountered the
subdirectory-creation error within Open MPI). I don't think the info
I'll provide will be meaningful here:

moe:~# getfacl /opt/sge/tmp
getfacl: Removing leading '/' from absolute path names
# file: opt/sge/tmp
# owner: sgeadmin
# group: fft
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:group:fft:rwx
default:mask::rwx
default:other::rwx

I'll try to use a local directory instead of a shared one for "tmpdir".
But as this issue seems somehow related to permissions, I don't know
whether this will turn out to be the right solution.
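
If switching is the way to go, a minimal sketch (assuming the queue in
question is smp8.q, as in the logs below) would be to point "tmpdir" at
a node-local path:

moe:~# qconf -mattr queue tmpdir /tmp smp8.q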

Thanks for your help,
Eloi

Reuti wrote:
> Hi,
>
> On 10.11.2009 at 19:01, Eloi Gaudry wrote:
>
>> Reuti,
>>
>> I'm using "tmpdir" as a shared directory that contains the session
>> directories created during job submission, not for computing or local
>> storage. Doesn't the session directory (i.e. job_id.queue_name) need
>> to be shared among all computing nodes (at least the ones that would
>> be used with orted during the parallel computation) ?
>
> no. orted runs happily with a local $TMPDIR on each and every node.
> The $TMPDIRs are intended to be used by the user for any temporary
> data of a job, as they are created and removed automatically by SGE
> for every job for convenience.
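>
> For illustration, a minimal job script using the per-job scratch
> directory (the solver and file names are just placeholders):
>
>   #!/bin/sh
>   #$ -S /bin/sh
>   cd $TMPDIR                         # node-local, created by SGE
>   cp $SGE_O_WORKDIR/input.dat .      # stage input in
>   my_solver input.dat > output.dat   # scratch I/O stays on the node
>   cp output.dat $SGE_O_WORKDIR/      # copy results back before SGE
>                                      # removes $TMPDIR at job end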
>
>
>> All sequential jobs run fine, as no write operation is performed in
>> "tmpdir/session_directory".
>>
>> All users are known on the computing nodes and the master node (we
>> use LDAP authentication on all nodes).
>>
>> As for the access checks:
>> moe:~# ls -alrtd /opt/sge/tmp
>> drwxrwxrwx+ 2 sgeadmin fft 4096 2009-11-10 18:28 /opt/sge/tmp
>
> Aha, the + tells that there are some ACLs set:
>
> getfacl /opt/sge/tmp
>
>
>> And for the parallel environment configuration:
>> moe:~# qconf -sp round_robin
>> pe_name round_robin
>> slots 32
>> user_lists NONE
>> xuser_lists NONE
>> start_proc_args /bin/true
>> stop_proc_args /bin/true
>> allocation_rule $round_robin
>> control_slaves TRUE
>> job_is_first_task FALSE
>> urgency_slots min
>> accounting_summary FALSE
>
> Okay, fine.
>
> -- Reuti
>
>
>> Thanks for your help,
>> Eloi
>>
>> Reuti wrote:
>>> On 10.11.2009 at 18:20, Eloi Gaudry wrote:
>>>
>>>> Thanks for your help Reuti,
>>>>
>>>> I'm using a nfs-shared directory (/opt/sge/tmp), exported from the
>>>> master node to all others computing nodes.
>>>
>>> It's highly advisable to have the "tmpdir" local on each node. When
>>> you use "cd $TMPDIR" in your job script, everything is done locally
>>> on the node (when your application just creates its scratch files in
>>> the current working directory), which will speed up the computation
>>> and decrease the network traffic. Computing in a shared /opt/sge/tmp
>>> is like computing in each user's home directory.
>>>
>>> To prevent any user from removing someone else's files, the "t"
>>> (sticky) flag is set, as for /tmp: drwxrwxrwt 14 root root 4096
>>> 2009-11-10 18:35 /tmp/
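>>>
>>> On a shared scratch directory the same mode can be set with:
>>>
>>>   chmod 1777 /opt/sge/tmp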
>>>
>>> Nevertheless:
>>>
>>>> with, in /etc/exports on the server (named moe.fft):
>>>>   /opt/sge 192.168.0.0/255.255.255.0(rw,sync,no_subtree_check)
>>>> and in /etc/fstab on the clients:
>>>>   moe.fft:/opt/sge /opt/sge nfs rw,bg,soft,timeo=14, 0 0
>>>> Actually, the /opt/sge/tmp directory is 777 across all machines,
>>>> thus all users should be able to create a directory inside.
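>>>>
>>>> A quick check for root squashing, run as root on one of the nodes
>>>> (the directory name is just for the test):
>>>>
>>>>   mkdir /opt/sge/tmp/roottest && ls -ld /opt/sge/tmp/roottest
>>>>
>>>> If the new directory shows up as nobody:nogroup, the export
>>>> squashes root on the clients.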
>>>
>>> All access checks will be applied (see the example commands after
>>> this list):
>>>
>>> - on the server: what is "ls -d /opt/sge/tmp" showing?
>>> - the one from the export (this seems to be fine)
>>> - the one on the node (i.e., how it's mounted: cat /etc/fstab)
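>>>
>>> For example, the three layers can be inspected with:
>>>
>>>   ls -ld /opt/sge/tmp       # on the server
>>>   exportfs -v               # on the server: effective export options
>>>   mount | grep /opt/sge     # on a node: actual mount options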
>>>
>>>> The issue seems somehow related to the session directory created
>>>> inside /opt/sge/tmp, let's say /opt/sge/tmp/29.1.smp8.q for
>>>> example for job 29 on queue smp8.q. This subdirectory of
>>>> /opt/sge/tmp is created with nobody:nogroup drwxr-xr-x
>>>> permissions... which in turn forbids
>>>
>>> Did you try to run some simple jobs before the parallel ones - are
>>> these working? The daemons (qmaster and execd) were started as root?
>>>
>>> The user is known on the file server, i.e. the machine hosting
>>> /opt/sge?
>>>
>>>> OpenMPI to create its subtree inside (as OpenMPI won't use
>>>> nobody:nogroup credentials).
>>>
>>> In SGE the master process (the one running the job script) will
>>> create /opt/sge/tmp/29.1.smp8.q, and so will each qrsh started
>>> inside SGE - all with the same name. What is the definition of the
>>> PE you use in SGE?
>>>
>>> -- Reuti
>>>
>>>
>>>> As Ralph suggested, I checked the SGE configuration, but I haven't
>>>> found anything related to a nobody:nogroup configuration so far.
>>>>
>>>> Eloi
>>>>
>>>>
>>>> Reuti wrote:
>>>>> Hi,
>>>>>
>>>>> On 10.11.2009 at 17:55, Eloi Gaudry wrote:
>>>>>
>>>>>> Thanks for your help Ralph, I'll double check that.
>>>>>>
>>>>>> As for the error message received, there might be some
>>>>>> inconsistency:
>>>>>> "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg_at_charlie_0" is the
>>>>>
>>>>> Often /opt/sge is shared across the nodes, while /tmp (sometimes
>>>>> implemented as a /scratch partition of its own) should be local
>>>>> on each node.
>>>>>
>>>>> What is the setting of "tmpdir" in your queue definition?
>>>>>
>>>>> If you want to share /opt/sge/tmp, everyone must be able to write
>>>>> to this location. As it's working fine for me (with a local /tmp),
>>>>> I assume the nobody/nogroup comes from a squash setting in the
>>>>> /etc/exports of your master node.
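>>>>>
>>>>> As a sketch of what to look for: root_squash is the default, and
>>>>> it maps everything root creates on the share to nobody:nogroup.
>>>>> If the session directory really must be created by root over NFS,
>>>>> an export line like
>>>>>
>>>>>   /opt/sge 192.168.0.0/255.255.255.0(rw,sync,no_subtree_check,no_root_squash)
>>>>>
>>>>> would avoid the mapping, at the usual cost of trusting root on
>>>>> all clients.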
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>
>>>>>> parent directory and
>>>>>> "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg_at_charlie_0/53199/0/0"
>>>>>> is the subdirectory... not the other way around.
>>>>>>
>>>>>> Eloi
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ralph Castain wrote:
>>>>>>> Creating a directory with such credentials sounds like a bug in
>>>>>>> SGE to me...perhaps an SGE config issue?
>>>>>>>
>>>>>>> The only thing you can do is tell OMPI to use some other
>>>>>>> directory as the root for its session dir tree - check "mpirun
>>>>>>> -h", or ompi_info, for the required option.
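>>>>>>>
>>>>>>> For instance, something along these lines (verify the exact
>>>>>>> parameter name with ompi_info on your version):
>>>>>>>
>>>>>>>   mpirun --mca orte_tmpdir_base /tmp ...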
>>>>>>>
>>>>>>> But I would first check your SGE config as that just doesn't
>>>>>>> sound right.
>>>>>>>
>>>>>>> On Nov 10, 2009, at 9:40 AM, Eloi Gaudry wrote:
>>>>>>>
>>>>>>>> Hi there,
>>>>>>>>
>>>>>>>> I'm experiencing some issues using GE 6.2u4 and Open MPI 1.3.3
>>>>>>>> (with the gridengine component).
>>>>>>>>
>>>>>>>> During any job submission, SGE creates a session directory in
>>>>>>>> $TMPDIR, named after the job id and the computing node name.
>>>>>>>> This session directory is created using nobody/nogroup
>>>>>>>> credentials.
>>>>>>>>
>>>>>>>> When using OpenMPI with tight-integration, opal creates
>>>>>>>> different subdirectories in this session directory. The issue
>>>>>>>> I'm facing now is that OpenMPI fails to create these
>>>>>>>> subdirectories:
>>>>>>>>
>>>>>>>> [charlie:03882] opal_os_dirpath_create: Error: Unable to create
>>>>>>>> the sub-directory
>>>>>>>> (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg_at_charlie_0) of
>>>>>>>> (/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg_at_charlie_0
>>>>>>>> [charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file
>>>>>>>> ../../openmpi-1.3.3/orte/util/session_dir.c at line 101
>>>>>>>> [charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file
>>>>>>>> ../../openmpi-1.3.3/orte/util/session_dir.c at line 425
>>>>>>>> [charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file
>>>>>>>> ../../../../../openmpi-1.3.3/orte/mca/ess/hnp/ess_hnp_module.c
>>>>>>>> at line 273
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> It looks like orte_init failed for some reason; your parallel
>>>>>>>> process is
>>>>>>>> likely to abort. There are many reasons that a parallel
>>>>>>>> process can
>>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>>> environment problems. This failure appears to be an internal
>>>>>>>> failure;
>>>>>>>> here's some additional information (which may only be relevant
>>>>>>>> to an
>>>>>>>> Open MPI developer):
>>>>>>>>
>>>>>>>> orte_session_dir failed
>>>>>>>> --> Returned value Error (-1) instead of ORTE_SUCCESS
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> [charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file
>>>>>>>> ../../openmpi-1.3.3/orte/runtime/orte_init.c at line 132
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> It looks like orte_init failed for some reason; your parallel
>>>>>>>> process is
>>>>>>>> likely to abort. There are many reasons that a parallel
>>>>>>>> process can
>>>>>>>> fail during orte_init; some of which are due to configuration or
>>>>>>>> environment problems. This failure appears to be an internal
>>>>>>>> failure;
>>>>>>>> here's some additional information (which may only be relevant
>>>>>>>> to an
>>>>>>>> Open MPI developer):
>>>>>>>>
>>>>>>>> orte_ess_set_name failed
>>>>>>>> --> Returned value Error (-1) instead of ORTE_SUCCESS
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> [charlie:03882] [[53199,0],0] ORTE_ERROR_LOG: Error in file
>>>>>>>> ../../../../openmpi-1.3.3/orte/tools/orterun/orterun.c at line 473
>>>>>>>>
>>>>>>>> This seems very likely related to the permissions set on $TMPDIR.
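>>>>>>>>
>>>>>>>> A quick way to see which credentials the session directory gets
>>>>>>>> is a one-line test job (the output file name is arbitrary):
>>>>>>>>
>>>>>>>>   echo 'ls -ld $TMPDIR' | qsub -j y -o tmpdir.log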
>>>>>>>>
>>>>>>>> I'd like to know if someone might have experienced the same or
>>>>>>>> a similar issue and if any solution was found.
>>>>>>>>
>>>>>>>> Thanks for your help,
>>>>>>>> Eloi
>>>>>>>>
>>>>>>>>

-- 
Eloi Gaudry
Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM
Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626