
Open MPI User's Mailing List Archives


From: Clement Chu (clement.chu_at_[hidden])
Date: 2005-11-10 18:23:38


Thanks for your help. kfc is the machine name and clement is the username
on that machine. Do you think that is the problem?

Then I removed the kfc machine and ran again. This time the MPI program
runs and there is no error message, but there is no program output
either. I think something is still wrong there.
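For reference, the source of "test" is not shown in this thread; a minimal
MPI program along these lines (an assumed example, not the actual code)
should print one line per rank when launched with -np 2, so complete
silence suggests the processes never reach MPI_Init or their stdout is not
being forwarded back to mpirun:

    /* Minimal MPI sanity check (assumed example; not the original "test"). */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Expect one line per process, e.g. two lines with -np 2. */
        printf("Hello from rank %d of %d\n", rank, size);

        MPI_Finalize();
        return 0;
    }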

[clement_at_localhost TestMPI]$ mpirun -d -np 2 test
[dhcppc0:02954] [0,0,0] setting up session dir with
[dhcppc0:02954] universe default-universe
[dhcppc0:02954] user clement
[dhcppc0:02954] host dhcppc0
[dhcppc0:02954] jobid 0
[dhcppc0:02954] procid 0
[dhcppc0:02954] procdir:
/tmp/openmpi-sessions-clement_at_dhcppc0_0/default-universe/0/0
[dhcppc0:02954] jobdir:
/tmp/openmpi-sessions-clement_at_dhcppc0_0/default-universe/0
[dhcppc0:02954] unidir:
/tmp/openmpi-sessions-clement_at_dhcppc0_0/default-universe
[dhcppc0:02954] top: openmpi-sessions-clement_at_dhcppc0_0
[dhcppc0:02954] tmp: /tmp
[dhcppc0:02954] [0,0,0] contact_file
/tmp/openmpi-sessions-clement_at_dhcppc0_0/default-universe/universe-setup.txt
[dhcppc0:02954] [0,0,0] wrote setup file
[dhcppc0:02954] spawn: in job_state_callback(jobid = 1, state = 0x1)
[dhcppc0:02954] pls:rsh: local csh: 0, local bash: 1
[dhcppc0:02954] pls:rsh: assuming same remote shell as local shell
[dhcppc0:02954] pls:rsh: remote csh: 0, remote bash: 1
[dhcppc0:02954] pls:rsh: final template argv:
[dhcppc0:02954] pls:rsh: ssh <template> orted --debug --bootproxy 1
--name <template> --num_procs 2 --vpid_start 0 --nodename <template>
--universe clement_at_dhcppc0:default-universe --nsreplica
"0.0.0;tcp://192.168.11.100:32780" --gprreplica
"0.0.0;tcp://192.168.11.100:32780" --mpi-call-yield 0
[dhcppc0:02954] pls:rsh: launching on node localhost
[dhcppc0:02954] pls:rsh: oversubscribed -- setting mpi_yield_when_idle
to 1 (1 2)
[dhcppc0:02954] pls:rsh: localhost is a LOCAL node
[dhcppc0:02954] pls:rsh: executing: orted --debug --bootproxy 1 --name
0.0.1 --num_procs 2 --vpid_start 0 --nodename localhost --universe
clement_at_dhcppc0:default-universe --nsreplica
"0.0.0;tcp://192.168.11.100:32780" --gprreplica
"0.0.0;tcp://192.168.11.100:32780" --mpi-call-yield 1
[dhcppc0:02955] [0,0,1] setting up session dir with
[dhcppc0:02955] universe default-universe
[dhcppc0:02955] user clement
[dhcppc0:02955] host localhost
[dhcppc0:02955] jobid 0
[dhcppc0:02955] procid 1
[dhcppc0:02955] procdir:
/tmp/openmpi-sessions-clement_at_localhost_0/default-universe/0/1
[dhcppc0:02955] jobdir:
/tmp/openmpi-sessions-clement_at_localhost_0/default-universe/0
[dhcppc0:02955] unidir:
/tmp/openmpi-sessions-clement_at_localhost_0/default-universe
[dhcppc0:02955] top: openmpi-sessions-clement_at_localhost_0
[dhcppc0:02955] tmp: /tmp
[dhcppc0:02955] sess_dir_finalize: proc session dir not empty - leaving
[dhcppc0:02955] sess_dir_finalize: proc session dir not empty - leaving
[dhcppc0:02955] orted: job_state_callback(jobid = 1, state =
ORTE_PROC_STATE_TERMINATED)
[dhcppc0:02955] sess_dir_finalize: found proc session dir empty - deleting
[dhcppc0:02955] sess_dir_finalize: found job session dir empty - deleting
[dhcppc0:02955] sess_dir_finalize: found univ session dir empty - deleting
[dhcppc0:02955] sess_dir_finalize: found top session dir empty - deleting
[dhcppc0:02954] spawn: in job_state_callback(jobid = 1, state = 0x9)
[dhcppc0:02954] sess_dir_finalize: found proc session dir empty - deleting
[dhcppc0:02954] sess_dir_finalize: found job session dir empty - deleting
[dhcppc0:02954] sess_dir_finalize: found univ session dir empty - deleting
[dhcppc0:02954] sess_dir_finalize: found top session dir empty - deleting

Clement

Jeff Squyres wrote:

>One minor thing that I notice in your ompi_info output -- your build
>and run machines are different (kfc vs. clement).
>
>Are these both FC4 machines, or are they different OS's/distros?
>
>
>On Nov 10, 2005, at 10:01 AM, Clement Chu wrote:
>
>
>
>>[clement_at_kfc TestMPI]$ mpirun -d -np 2 test
>>[kfc:29199] procdir: (null)
>>[kfc:29199] jobdir: (null)
>>[kfc:29199] unidir:
>>/tmp/openmpi-sessions-clement_at_kfc_0/default-universe
>>[kfc:29199] top: openmpi-sessions-clement_at_kfc_0
>>[kfc:29199] tmp: /tmp
>>[kfc:29199] [0,0,0] setting up session dir with
>>[kfc:29199] tmpdir /tmp
>>[kfc:29199] universe default-universe-29199
>>[kfc:29199] user clement
>>[kfc:29199] host kfc
>>[kfc:29199] jobid 0
>>[kfc:29199] procid 0
>>[kfc:29199] procdir:
>>/tmp/openmpi-sessions-clement_at_kfc_0/default-universe-29199/0/0
>>[kfc:29199] jobdir:
>>/tmp/openmpi-sessions-clement_at_kfc_0/default-universe-29199/0
>>[kfc:29199] unidir:
>>/tmp/openmpi-sessions-clement_at_kfc_0/default-universe-29199
>>[kfc:29199] top: openmpi-sessions-clement_at_kfc_0
>>[kfc:29199] tmp: /tmp
>>[kfc:29199] [0,0,0] contact_file
>>/tmp/openmpi-sessions-clement_at_kfc_0/default-universe-29199/universe-
>>setup.txt
>>[kfc:29199] [0,0,0] wrote setup file
>>[kfc:29199] pls:rsh: local csh: 0, local bash: 1
>>[kfc:29199] pls:rsh: assuming same remote shell as local shell
>>[kfc:29199] pls:rsh: remote csh: 0, remote bash: 1
>>[kfc:29199] pls:rsh: final template argv:
>>[kfc:29199] pls:rsh: ssh <template> orted --debug --bootproxy 1
>>--name <template> --num_procs 2 --vpid_start 0 --nodename <template>
>>--universe clement_at_kfc:default-universe-29199 --nsreplica
>>"0.0.0;tcp://192.168.11.101:32784" --gprreplica
>>"0.0.0;tcp://192.168.11.101:32784" --mpi-call-yield 0
>>[kfc:29199] pls:rsh: launching on node localhost
>>[kfc:29199] pls:rsh: oversubscribed -- setting mpi_yield_when_idle to 1
>>(1 2)
>>[kfc:29199] sess_dir_finalize: proc session dir not empty - leaving
>>[kfc:29199] spawn: in job_state_callback(jobid = 1, state = 0xa)
>>mpirun noticed that job rank 1 with PID 0 on node "localhost" exited on
>>signal 11.
>>[kfc:29199] sess_dir_finalize: proc session dir not empty - leaving
>>[kfc:29199] spawn: in job_state_callback(jobid = 1, state = 0x9)
>>[kfc:29199] ERROR: A daemon on node localhost failed to start as
>>expected.
>>[kfc:29199] ERROR: There may be more information available from
>>[kfc:29199] ERROR: the remote shell (see above).
>>[kfc:29199] The daemon received a signal 11.
>>1 additional process aborted (not shown)
>>[kfc:29199] sess_dir_finalize: found proc session dir empty - deleting
>>[kfc:29199] sess_dir_finalize: found job session dir empty - deleting
>>[kfc:29199] sess_dir_finalize: found univ session dir empty - deleting
>>[kfc:29199] sess_dir_finalize: top session dir not empty - leaving
>>
>>
>>ompi_info output message:
>>
>>[clement_at_kfc TestMPI]$ ompi_info
>> Open MPI: 1.0rc5r8053
>> Open MPI SVN revision: r8053
>> Open RTE: 1.0rc5r8053
>> Open RTE SVN revision: r8053
>> OPAL: 1.0rc5r8053
>> OPAL SVN revision: r8053
>> Prefix: /home/clement/openmpi
>> Configured architecture: i686-pc-linux-gnu
>> Configured by: clement
>> Configured on: Fri Nov 11 00:37:23 EST 2005
>> Configure host: kfc
>> Built by: clement
>> Built on: Fri Nov 11 00:59:26 EST 2005
>> Built host: kfc
>> C bindings: yes
>> C++ bindings: yes
>> Fortran77 bindings: yes (all)
>> Fortran90 bindings: yes
>> C compiler: gcc
>> C compiler absolute: /usr/bin/gcc
>> C++ compiler: g++
>> C++ compiler absolute: /usr/bin/g++
>> Fortran77 compiler: gfortran
>> Fortran77 compiler abs: /usr/bin/gfortran
>> Fortran90 compiler: gfortran
>> Fortran90 compiler abs: /usr/bin/gfortran
>> C profiling: yes
>> C++ profiling: yes
>> Fortran77 profiling: yes
>> Fortran90 profiling: yes
>> C++ exceptions: no
>> Thread support: posix (mpi: no, progress: no)
>> Internal debug support: no
>> MPI parameter check: runtime
>>Memory profiling support: no
>>Memory debugging support: no
>> libltdl support: 1
>> MCA memory: malloc_hooks (MCA v1.0, API v1.0, Component
>>v1.0)
>> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.0)
>> MCA maffinity: first_use (MCA v1.0, API v1.0, Component
>>v1.0)
>> MCA timer: linux (MCA v1.0, API v1.0, Component v1.0)
>> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
>> MCA coll: basic (MCA v1.0, API v1.0, Component v1.0)
>> MCA coll: self (MCA v1.0, API v1.0, Component v1.0)
>> MCA coll: sm (MCA v1.0, API v1.0, Component v1.0)
>> MCA io: romio (MCA v1.0, API v1.0, Component v1.0)
>> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.0)
>> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.0)
>> MCA pml: teg (MCA v1.0, API v1.0, Component v1.0)
>> MCA pml: uniq (MCA v1.0, API v1.0, Component v1.0)
>> MCA ptl: self (MCA v1.0, API v1.0, Component v1.0)
>> MCA ptl: sm (MCA v1.0, API v1.0, Component v1.0)
>> MCA ptl: tcp (MCA v1.0, API v1.0, Component v1.0)
>> MCA btl: self (MCA v1.0, API v1.0, Component v1.0)
>> MCA btl: sm (MCA v1.0, API v1.0, Component v1.0)
>> MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
>> MCA topo: unity (MCA v1.0, API v1.0, Component v1.0)
>> MCA gpr: null (MCA v1.0, API v1.0, Component v1.0)
>> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.0)
>> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.0)
>> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.0)
>> MCA iof: svc (MCA v1.0, API v1.0, Component v1.0)
>> MCA ns: proxy (MCA v1.0, API v1.0, Component v1.0)
>> MCA ns: replica (MCA v1.0, API v1.0, Component v1.0)
>> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>> MCA ras: dash_host (MCA v1.0, API v1.0, Component
>>v1.0)
>> MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.0)
>> MCA ras: localhost (MCA v1.0, API v1.0, Component
>>v1.0)
>> MCA ras: slurm (MCA v1.0, API v1.0, Component v1.0)
>> MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.0)
>> MCA rds: resfile (MCA v1.0, API v1.0, Component v1.0)
>> MCA rmaps: round_robin (MCA v1.0, API v1.0, Component
>>v1.0)
>> MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.0)
>> MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.0)
>> MCA rml: oob (MCA v1.0, API v1.0, Component v1.0)
>> MCA pls: fork (MCA v1.0, API v1.0, Component v1.0)
>> MCA pls: proxy (MCA v1.0, API v1.0, Component v1.0)
>> MCA pls: rsh (MCA v1.0, API v1.0, Component v1.0)
>> MCA pls: slurm (MCA v1.0, API v1.0, Component v1.0)
>> MCA sds: env (MCA v1.0, API v1.0, Component v1.0)
>> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.0)
>> MCA sds: seed (MCA v1.0, API v1.0, Component v1.0)
>> MCA sds: singleton (MCA v1.0, API v1.0, Component
>>v1.0)
>> MCA sds: slurm (MCA v1.0, API v1.0, Component v1.0)
>>[clement_at_kfc TestMPI]$
>>
>>
>
>
>

-- 
Clement Kam Man Chu
Research Assistant
School of Computer Science & Software Engineering
Monash University, Caulfield Campus
Ph: 61 3 9903 1964