Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Tmpdir work for first process only
From: Clement Kam Man Chu (clement.chu_at_[hidden])
Date: 2007-11-15 00:36:00


Hi,

I have figured out why the tmpdir parameter worked only for the first
process. However, I now have another problem when I try to run 400
processes (there is no problem with fewer than 400). I get the error
"ORTE_ERROR_LOG: Out of resource in file base/iof_base_setup.c at line
106". The full output is below:

[ac27:12442] [0,0,0] setting up session dir with
[ac27:12442] tmpdir /jobfs/z07/247752.ac-pbs
[ac27:12442] universe default-universe-12442
[ac27:12442] user kxc565
[ac27:12442] host ac27
[ac27:12442] jobid 0
[ac27:12442] procid 0
[ac27:12442] procdir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565_at_ac27_0/default-universe-12442/0/0
[ac27:12442] jobdir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565_at_ac27_0/default-universe-12442/0
[ac27:12442] unidir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565_at_ac27_0/default-universe-12442
[ac27:12442] top: openmpi-sessions-kxc565_at_ac27_0
[ac27:12442] tmp: ??
[ac27:12442] [0,0,0] contact_file
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565_at_ac27_0/default-universe-12442/universe-setup.txt
[ac27:12442] [0,0,0] wrote setup file
[ac27:12447] [0,0,1] setting up session dir with
[ac27:12447] universe default-universe-12442
[ac27:12447] user kxc565
[ac27:12447] host ac27
[ac27:12447] jobid 0
[ac27:12447] procid 1
[ac27:12447] procdir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565_at_ac27_0/default-universe-12442/0/1
[ac27:12447] jobdir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565_at_ac27_0/default-universe-12442/0
[ac27:12447] unidir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565_at_ac27_0/default-universe-12442
[ac27:12447] top: openmpi-sessions-kxc565_at_ac27_0
[ac27:12447] tmp: /jobfs/z07/247752.ac-pbs
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
base/iof_base_setup.c at line 106
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
odls_default_module.c at line 663
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
odls_default_module.c at line 1191
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file orted.c at
line 594
[ac27:12442] spawn: in job_state_callback(jobid = 1, state = 0x80)
mpirun noticed that job rank 0 with PID 0 on node ac27 exited on signal
15 (Terminated).
[ac27:12447] sess_dir_finalize: job session dir not empty - leaving
[ac27:12447] sess_dir_finalize: proc session dir not empty - leaving
[ac27:12442] sess_dir_finalize: proc session dir not empty - leaving

Thanks,
Clement

Clement Kam Man Chu wrote:
> Hi,
>
> I am using Open MPI 1.2.3 on an ia64 machine. I ran "mpirun -d --tmpdir
> /home/565/kxc565/tmpdir -mca btl sm -np 400 ./testprogram". Only the
> first process uses my tmpdir setting to store its temporary files; the
> second process falls back to the default and stores them in the /tmp
> directory. How can I make all processes store their temporary files in
> the directory I specify? I have included the Open MPI output below for
> more detail.
> Thanks for any help.
>
> Cheers,
> Clement
>
>
> [ac27:27928] [0,0,0] setting up session dir with
> [ac27:27928] tmpdir /home/565/kxc565/tmpdir
> [ac27:27928] universe default-universe-27928
> [ac27:27928] user kxc565
> [ac27:27928] host ac27
> [ac27:27928] jobid 0
> [ac27:27928] procid 0
> [ac27:27928] procdir:
> /home/565/kxc565/tmpdir/openmpi-sessions-kxc565_at_ac27_0/default-universe-27928/0/0
> [ac27:27928] jobdir:
> /home/565/kxc565/tmpdir/openmpi-sessions-kxc565_at_ac27_0/default-universe-27928/0
> [ac27:27928] unidir:
> /home/565/kxc565/tmpdir/openmpi-sessions-kxc565_at_ac27_0/default-universe-27928
> [ac27:27928] top: openmpi-sessions-kxc565_at_ac27_0
> [ac27:27928] tmp: ?
> [ac27:27928] [0,0,0] contact_file
> /home/565/kxc565/tmpdir/openmpi-sessions-kxc565_at_ac27_0/default-universe-27928/universe-setup.txt
> [ac27:27928] [0,0,0] wrote setup file
> [ac27:27932] [0,0,1] setting up session dir with
> [ac27:27932] universe default-universe-27928
> [ac27:27932] user kxc565
> [ac27:27932] host ac27
> [ac27:27932] jobid 0
> [ac27:27932] procid 1
> [ac27:27932] procdir:
> /tmp/openmpi-sessions-kxc565_at_ac27_0/default-universe-27928/0/1
> [ac27:27932] jobdir:
> /tmp/openmpi-sessions-kxc565_at_ac27_0/default-universe-27928/0
> [ac27:27932] unidir:
> /tmp/openmpi-sessions-kxc565_at_ac27_0/default-universe-27928
> [ac27:27932] top: openmpi-sessions-kxc565_at_ac27_0
> [ac27:27932] tmp: /tmp
> [ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
> base/iof_base_setup.c at line 106
> [ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
> odls_default_module.c at line 663
> [ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
> odls_default_module.c at line 1191
> [ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file orted.c at
> line 594
> [ac27:27928] spawn: in job_state_callback(jobid = 1, state = 0x80)
> mpirun noticed that job rank 0 with PID 0 on node ac27 exited on signal
> 15 (Terminated).
> [ac27:27932] sess_dir_finalize: job session dir not empty - leaving
> [ac27:27932] sess_dir_finalize: proc session dir not empty - leaving
> [ac27:27928] sess_dir_finalize: proc session dir not empty - leaving
>
>

-- 
Clement Kam Man Chu
Research Assistant
Faculty of Information Technology
Monash University, Caulfield Campus
Ph: 61 3 9903 2355