
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] rankfiles in openmpi-1.7.4
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-02-10 20:06:03


Hmmm...afraid there isn't much I can offer here, Siegmar. For whatever reason, hwloc is indicating it cannot bind processes on that architecture.
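
If it helps to narrow this down: the support flags hwloc exposes to Open MPI can also be queried directly on rs0. Below is only a minimal sketch (the file name is made up, and it assumes the hwloc headers and library that Open MPI was built against are available); the flags it prints correspond to the "Bind CPU proc" / "Bind CPU thread" entries in your topology dump.

/* check_cpubind.c: hypothetical helper; build with: cc check_cpubind.c -lhwloc */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    const struct hwloc_topology_support *support;

    /* Discover the local machine topology, as Open MPI's daemons do. */
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* hwloc's own view of whether CPU binding can be enforced here. */
    support = hwloc_topology_get_support(topo);
    printf("set_thisproc_cpubind:   %d\n", support->cpubind->set_thisproc_cpubind);
    printf("set_thisthread_cpubind: %d\n", support->cpubind->set_thisthread_cpubind);
    printf("set_proc_cpubind:       %d\n", support->cpubind->set_proc_cpubind);

    hwloc_topology_destroy(topo);
    return 0;
}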

On Feb 9, 2014, at 12:08 PM, Siegmar Gross <Siegmar.Gross_at_[hidden]> wrote:

> Hi Ralph,
>
> thank you very much for your reply. I have changed my rankfile.
>
> rank 0=rs0 slot=0:0-1
> rank 1=rs0 slot=1
> rank 2=rs1 slot=0
> rank 3=rs1 slot=1
>
> Now I get the following output.
>
> rs0 openmpi_1.7.x_or_newer 108 mpiexec --report-bindings \
> --use-hwthread-cpus -np 4 -rf rf_rs0_rs1 hostname
> --------------------------------------------------------------------------
> Open MPI tried to bind a new process, but something went wrong. The
> process was killed without launching the target application. Your job
> will now abort.
>
> Local host: rs0
> Application name: /usr/local/bin/hostname
> Error message: hwloc indicates cpu binding cannot be enforced
> Location:
> ../../../../../openmpi-1.7.4/orte/mca/odls/default/odls_default_module.c:499
> --------------------------------------------------------------------------
> rs0 openmpi_1.7.x_or_newer 109
>
>
> Kind regards
>
> Siegmar
>
>
>
>
>>> Today I tested rankfiles once more. The good news first: openmpi-1.7.4
>>> now supports process binding on my Sun M4000 server with SPARC64 VII
>>> processors when requested on the command line.
>>>
>>> rs0 openmpi_1.7.x_or_newer 104 mpiexec --report-bindings -np 4 \
>>> --bind-to hwthread hostname
>>> [rs0.informatik.hs-fulda.de:06051] MCW rank 1 bound to
>>> socket 0[core 1[hwt 0]]: [../B./../..][../../../..]
>>> [rs0.informatik.hs-fulda.de:06051] MCW rank 2 bound to
>>> socket 1[core 4[hwt 0]]: [../../../..][B./../../..]
>>> [rs0.informatik.hs-fulda.de:06051] MCW rank 3 bound to
>>> socket 1[core 5[hwt 0]]: [../../../..][../B./../..]
>>> [rs0.informatik.hs-fulda.de:06051] MCW rank 0 bound to
>>> socket 0[core 0[hwt 0]]: [B./../../..][../../../..]
>>> rs0.informatik.hs-fulda.de
>>> rs0.informatik.hs-fulda.de
>>> rs0.informatik.hs-fulda.de
>>> rs0.informatik.hs-fulda.de
>>> rs0 openmpi_1.7.x_or_newer 105
>>>
>>> Thank you very much for solving this problem. Unfortunately I still
>>> have a problem with a rankfile. Contents of my rankfile:
>>>
>>> rank 0=rs0 slot=0:0-7
>>> rank 1=rs0 slot=1
>>> rank 2=rs1 slot=0
>>> rank 3=rs1 slot=1
>>>
>>
>>
>> Here's your problem - you told us socket 0, cores 0-7. However, if
>> you look at your topology, you only have *4* cores in socket 0.
>>
>>
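>>
>> In case it helps, a rankfile sketch that would be consistent with the topology
>> you posted (each socket shows four cores, each core two hardware threads) could
>> look like the following; the exact placement is only an assumption about what
>> you intended:
>>
>> rank 0=rs0 slot=0:0-3
>> rank 1=rs0 slot=1
>> rank 2=rs1 slot=0
>> rank 3=rs1 slot=1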
>>>
>>> rs0 openmpi_1.7.x_or_newer 105 mpiexec --report-bindings \
>>> --use-hwthread-cpus -np 4 -rf rf_rs0_rs1 hostname
>>> [rs0.informatik.hs-fulda.de:06060] [[7659,0],0] ORTE_ERROR_LOG: Not
>>> found in file
>>> .../openmpi-1.7.4/orte/mca/rmaps/rank_file/rmaps_rank_file.c
>>> at line 283
>>> [rs0.informatik.hs-fulda.de:06060] [[7659,0],0] ORTE_ERROR_LOG: Not
>>> found in file
>>> .../openmpi-1.7.4/orte/mca/rmaps/base/rmaps_base_map_job.c
>>> at line 284
>>> rs0 openmpi_1.7.x_or_newer 106
>>>
>>>
>>> rs0 openmpi_1.7.x_or_newer 110 mpiexec --report-bindings \
>>> --display-allocation --mca rmaps_base_verbose_100 \
>>> --use-hwthread-cpus -np 4 -rf rf_rs0_rs1 hostname
>>>
>>> ====================== ALLOCATED NODES ======================
>>> rs0: slots=2 max_slots=0 slots_inuse=0
>>> rs1: slots=2 max_slots=0 slots_inuse=0
>>> =================================================================
>>> [rs0.informatik.hs-fulda.de:06074] [[7677,0],0] ORTE_ERROR_LOG: Not found in file
>>> ../../../../../openmpi-1.7.4/orte/mca/rmaps/rank_file/rmaps_rank_file.c at line 283
>>> [rs0.informatik.hs-fulda.de:06074] [[7677,0],0] ORTE_ERROR_LOG: Not found in file
>>> ../../../../openmpi-1.7.4/orte/mca/rmaps/base/rmaps_base_map_job.c at line 284
>>> rs0 openmpi_1.7.x_or_newer 111
>>>
>>>
>>> rs0 openmpi_1.7.x_or_newer 111 mpiexec --report-bindings --display-allocation \
>>> --mca ess_base_verbose 5 --use-hwthread-cpus -np 4 -rf rf_rs0_rs1 hostname
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Querying
> component [env]
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Skipping
> component [env]. Query failed to return a module
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Querying
> component [hnp]
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Query of
> component [hnp] set priority to 100
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Querying
> component [singleton]
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Skipping
> component [singleton]. Query failed to return a module
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Querying
> component [tool]
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Skipping
> component [tool]. Query failed to return a module
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Selected
> component [hnp]
>>> [rs0.informatik.hs-fulda.de:06078] [[INVALID],INVALID] Topology Info:
>>> [rs0.informatik.hs-fulda.de:06078] Type: Machine Number of child objects: 1
>>> Name=NULL
>>> total=33554432KB
>>> Backend=Solaris
>>> OSName=SunOS
>>> OSRelease=5.10
>>> OSVersion=Generic_150400-04
>>> Architecture=sun4u
>>> Cpuset: 0x0000ffff
>>> Online: 0x0000ffff
>>> Allowed: 0x0000ffff
>>> Bind CPU proc: TRUE
>>> Bind CPU thread: TRUE
>>> Bind MEM proc: TRUE
>>> Bind MEM thread: TRUE
>>> Type: NUMANode Number of child objects: 2
>>> Name=NULL
>>> local=33554432KB
>>> total=33554432KB
>>> Cpuset: 0x0000ffff
>>> Online: 0x0000ffff
>>> Allowed: 0x0000ffff
>>> Type: Socket Number of child objects: 4
>>> Name=NULL
>>> CPUType=sparcv9
>>> CPUModel=SPARC64_VII
>>> Cpuset: 0x000000ff
>>> Online: 0x000000ff
>>> Allowed: 0x000000ff
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000003
>>> Online: 0x00000003
>>> Allowed: 0x00000003
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000001
>>> Online: 0x00000001
>>> Allowed: 0x00000001
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000002
>>> Online: 0x00000002
>>> Allowed: 0x00000002
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x0000000c
>>> Online: 0x0000000c
>>> Allowed: 0x0000000c
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000004
>>> Online: 0x00000004
>>> Allowed: 0x00000004
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000008
>>> Online: 0x00000008
>>> Allowed: 0x00000008
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000030
>>> Online: 0x00000030
>>> Allowed: 0x00000030
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000010
>>> Online: 0x00000010
>>> Allowed: 0x00000010
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000020
>>> Online: 0x00000020
>>> Allowed: 0x00000020
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x000000c0
>>> Online: 0x000000c0
>>> Allowed: 0x000000c0
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000040
>>> Online: 0x00000040
>>> Allowed: 0x00000040
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000080
>>> Online: 0x00000080
>>> Allowed: 0x00000080
>>> Type: Socket Number of child objects: 4
>>> Name=NULL
>>> CPUType=sparcv9
>>> CPUModel=SPARC64_VII
>>> Cpuset: 0x0000ff00
>>> Online: 0x0000ff00
>>> Allowed: 0x0000ff00
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000300
>>> Online: 0x00000300
>>> Allowed: 0x00000300
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000100
>>> Online: 0x00000100
>>> Allowed: 0x00000100
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000200
>>> Online: 0x00000200
>>> Allowed: 0x00000200
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000c00
>>> Online: 0x00000c00
>>> Allowed: 0x00000c00
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000400
>>> Online: 0x00000400
>>> Allowed: 0x00000400
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000800
>>> Online: 0x00000800
>>> Allowed: 0x00000800
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00003000
>>> Online: 0x00003000
>>> Allowed: 0x00003000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00001000
>>> Online: 0x00001000
>>> Allowed: 0x00001000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00002000
>>> Online: 0x00002000
>>> Allowed: 0x00002000
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x0000c000
>>> Online: 0x0000c000
>>> Allowed: 0x0000c000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00004000
>>> Online: 0x00004000
>>> Allowed: 0x00004000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00008000
>>> Online: 0x00008000
>>> Allowed: 0x00008000
>>> [rs1.informatik.hs-fulda.de:09657] mca:base:select:( ess) Querying
> component [env]
>>> [rs1.informatik.hs-fulda.de:09657] mca:base:select:( ess) Query of
> component [env] set priority to 20
>>> [rs1.informatik.hs-fulda.de:09657] mca:base:select:( ess) Selected
> component [env]
>>> [rs1.informatik.hs-fulda.de:09657] ess:env set name to [[7673,0],1]
>>> [rs1.informatik.hs-fulda.de:09657] [[7673,0],1] Topology Info:
>>> [rs1.informatik.hs-fulda.de:09657] Type: Machine Number of child objects: 1
>>> Name=NULL
>>> total=33554432KB
>>> Backend=Solaris
>>> OSName=SunOS
>>> OSRelease=5.10
>>> OSVersion=Generic_150400-04
>>> Architecture=sun4u
>>> Cpuset: 0x0000ffff
>>> Online: 0x0000ffff
>>> Allowed: 0x0000ffff
>>> Bind CPU proc: TRUE
>>> Bind CPU thread: TRUE
>>> Bind MEM proc: TRUE
>>> Bind MEM thread: TRUE
>>> Type: NUMANode Number of child objects: 2
>>> Name=NULL
>>> local=33554432KB
>>> total=33554432KB
>>> Cpuset: 0x0000ffff
>>> Online: 0x0000ffff
>>> Allowed: 0x0000ffff
>>> Type: Socket Number of child objects: 4
>>> Name=NULL
>>> CPUType=sparcv9
>>> CPUModel=SPARC64_VII
>>> Cpuset: 0x000000ff
>>> Online: 0x000000ff
>>> Allowed: 0x000000ff
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000003
>>> Online: 0x00000003
>>> Allowed: 0x00000003
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000001
>>> Online: 0x00000001
>>> Allowed: 0x00000001
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000002
>>> Online: 0x00000002
>>> Allowed: 0x00000002
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x0000000c
>>> Online: 0x0000000c
>>> Allowed: 0x0000000c
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000004
>>> Online: 0x00000004
>>> Allowed: 0x00000004
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000008
>>> Online: 0x00000008
>>> Allowed: 0x00000008
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000030
>>> Online: 0x00000030
>>> Allowed: 0x00000030
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000010
>>> Online: 0x00000010
>>> Allowed: 0x00000010
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000020
>>> Online: 0x00000020
>>> Allowed: 0x00000020
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x000000c0
>>> Online: 0x000000c0
>>> Allowed: 0x000000c0
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000040
>>> Online: 0x00000040
>>> Allowed: 0x00000040
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000080
>>> Online: 0x00000080
>>> Allowed: 0x00000080
>>> Type: Socket Number of child objects: 4
>>> Name=NULL
>>> CPUType=sparcv9
>>> CPUModel=SPARC64_VII
>>> Cpuset: 0x0000ff00
>>> Online: 0x0000ff00
>>> Allowed: 0x0000ff00
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000300
>>> Online: 0x00000300
>>> Allowed: 0x00000300
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000100
>>> Online: 0x00000100
>>> Allowed: 0x00000100
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000200
>>> Online: 0x00000200
>>> Allowed: 0x00000200
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000c00
>>> Online: 0x00000c00
>>> Allowed: 0x00000c00
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000400
>>> Online: 0x00000400
>>> Allowed: 0x00000400
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000800
>>> Online: 0x00000800
>>> Allowed: 0x00000800
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00003000
>>> Online: 0x00003000
>>> Allowed: 0x00003000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00001000
>>> Online: 0x00001000
>>> Allowed: 0x00001000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00002000
>>> Online: 0x00002000
>>> Allowed: 0x00002000
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x0000c000
>>> Online: 0x0000c000
>>> Allowed: 0x0000c000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00004000
>>> Online: 0x00004000
>>> Allowed: 0x00004000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00008000
>>> Online: 0x00008000
>>> Allowed: 0x00008000
>>>
>>> ====================== ALLOCATED NODES ======================
>>> rs0: slots=2 max_slots=0 slots_inuse=0
>>> rs1: slots=2 max_slots=0 slots_inuse=0
>>> =================================================================
>>> [rs0.informatik.hs-fulda.de:06078] [[7673,0],0] ORTE_ERROR_LOG: Not found in file
>>> ../../../../../openmpi-1.7.4/orte/mca/rmaps/rank_file/rmaps_rank_file.c at line 283
>>> [rs0.informatik.hs-fulda.de:06078] [[7673,0],0] ORTE_ERROR_LOG: Not found in file
>>> ../../../../openmpi-1.7.4/orte/mca/rmaps/base/rmaps_base_map_job.c at line 284
>>> [rs1.informatik.hs-fulda.de:09657] [[7673,0],1] setting up session dir with
>>> tmpdir: UNDEF
>>> host rs1
>>> rs0 openmpi_1.7.x_or_newer 112
>>>
>>>
>>>
>>>
>>> rs0 openmpi_1.7.x_or_newer 113 mpiexec --report-bindings --display-allocation \
>>> --mca plm_base_verbose 100 --use-hwthread-cpus -np 4 -rf rf_rs0_rs1 hostname
>>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_register:
> registering plm components
>>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_register: found
> loaded component rsh
>>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_register: component
> rsh register function successful
>>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_open: opening plm
> components
>>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_open: found loaded
> component rsh
>>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_open: component rsh
> open function successful
>>> [rs0.informatik.hs-fulda.de:06088] mca:base:select: Auto-selecting plm
> components
>>> [rs0.informatik.hs-fulda.de:06088] mca:base:select:( plm) Querying
> component [rsh]
>>> [rs0.informatik.hs-fulda.de:06088] [[INVALID],INVALID] plm:rsh_lookup on
> agent ssh : rsh path NULL
>>> [rs0.informatik.hs-fulda.de:06088] mca:base:select:( plm) Query of
> component [rsh] set priority to 10
>>> [rs0.informatik.hs-fulda.de:06088] mca:base:select:( plm) Selected
> component [rsh]
>>> [rs0.informatik.hs-fulda.de:06088] plm:base:set_hnp_name: initial bias 6088
> nodename hash 3909477186
>>> [rs0.informatik.hs-fulda.de:06088] plm:base:set_hnp_name: final jobfam 7567
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh_setup on agent ssh :
> rsh path NULL
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:receive start comm
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_job
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_vm
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_vm creating
> map
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] setup:vm: working unmanaged
> allocation
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] using rankfile rf_rs0_rs1
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] checking node rs0
>>>
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] ignoring myself
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] checking node rs1
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_vm add new
> daemon [[7567,0],1]
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_vm assigning
> new daemon [[7567,0],1] to node rs1
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: launching vm
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: local shell: 2
> (tcsh)
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: assuming same
> remote shell as local shell
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: remote shell: 2
> (tcsh)
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: final template
> argv:
>>> /usr/local/bin/ssh <template> orted -mca orte_report_bindings 1 -mca
> ess env -mca orte_ess_jobid 495910912 -mca
>>> orte_ess_vpid <template> -mca orte_ess_num_procs 2 -mca orte_hnp_uri
>>> "495910912.0;tcp://193.174.26.198,192.168.128.1,10.1.1.2:43810" --tree-spawn
> --mca plm_base_verbose 100 -mca plm rsh -mca
>>> orte_rankfile rf_rs0_rs1 -mca hwloc_base_use_hwthreads_as_cpus 1 -mca
> orte_display_alloc 1 -mca hwloc_base_report_bindings 1
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh:launch daemon 0 not
> a child of mine
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: adding node rs1 to
> launch list
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: activating launch
> event
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: recording launch of
> daemon [[7567,0],1]
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: executing:
> (/usr/local/bin/ssh) [/usr/local/bin/ssh rs1 orted -mca
>>> orte_report_bindings 1 -mca ess env -mca orte_ess_jobid 495910912 -mca
> orte_ess_vpid 1 -mca orte_ess_num_procs 2 -mca
>>> orte_hnp_uri "495910912.0;tcp://193.174.26.198,192.168.128.1,10.1.1.2:43810"
> --tree-spawn --mca plm_base_verbose 100 -mca plm
>>> rsh -mca orte_rankfile rf_rs0_rs1 -mca hwloc_base_use_hwthreads_as_cpus 1
> -mca orte_display_alloc 1 -mca
>>> hwloc_base_report_bindings 1]
>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_register:
> registering plm components
>>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_register: found
> loaded component rsh
>>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_register: component
> rsh register function successful
>>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_open: opening plm
> components
>>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_open: found loaded
> component rsh
>>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_open: component rsh
> open function successful
>>> [rs1.informatik.hs-fulda.de:09721] mca:base:select: Auto-selecting plm
> components
>>> [rs1.informatik.hs-fulda.de:09721] mca:base:select:( plm) Querying
> component [rsh]
>>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:rsh_lookup on agent ssh
> : rsh path NULL
>>> [rs1.informatik.hs-fulda.de:09721] mca:base:select:( plm) Query of
> component [rsh] set priority to 10
>>> [rs1.informatik.hs-fulda.de:09721] mca:base:select:( plm) Selected
> component [rsh]
>>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:rsh_setup on agent ssh :
> rsh path NULL
>>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:base:receive start comm
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:orted_report_launch
> from daemon [[7567,0],1]
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:orted_report_launch
> from daemon [[7567,0],1] on node rs1
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] RECEIVED TOPOLOGY FROM NODE
> rs1
>>> [rs0.informatik.hs-fulda.de:06088] Type: Machine Number of child objects: 1
>>> Name=NULL
>>> total=33554432KB
>>> Backend=Solaris
>>> OSName=SunOS
>>> OSRelease=5.10
>>> OSVersion=Generic_150400-04
>>> Architecture=sun4u
>>> Cpuset: 0x0000ffff
>>> Online: 0x0000ffff
>>> Allowed: 0x0000ffff
>>> Bind CPU proc: TRUE
>>> Bind CPU thread: TRUE
>>> Bind MEM proc: TRUE
>>> Bind MEM thread: TRUE
>>> Type: NUMANode Number of child objects: 2
>>> Name=NULL
>>> local=33554432KB
>>> total=33554432KB
>>> Cpuset: 0x0000ffff
>>> Online: 0x0000ffff
>>> Allowed: 0x0000ffff
>>> Type: Socket Number of child objects: 4
>>> Name=NULL
>>> CPUType=sparcv9
>>> CPUModel=SPARC64_VII
>>> Cpuset: 0x000000ff
>>> Online: 0x000000ff
>>> Allowed: 0x000000ff
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000003
>>> Online: 0x00000003
>>> Allowed: 0x00000003
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000001
>>> Online: 0x00000001
>>> Allowed: 0x00000001
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000002
>>> Online: 0x00000002
>>> Allowed: 0x00000002
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x0000000c
>>> Online: 0x0000000c
>>> Allowed: 0x0000000c
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000004
>>> Online: 0x00000004
>>> Allowed: 0x00000004
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000008
>>> Online: 0x00000008
>>> Allowed: 0x00000008
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000030
>>> Online: 0x00000030
>>> Allowed: 0x00000030
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000010
>>> Online: 0x00000010
>>> Allowed: 0x00000010
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000020
>>> Online: 0x00000020
>>> Allowed: 0x00000020
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x000000c0
>>> Online: 0x000000c0
>>> Allowed: 0x000000c0
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000040
>>> Online: 0x00000040
>>> Allowed: 0x00000040
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000080
>>> Online: 0x00000080
>>> Allowed: 0x00000080
>>> Type: Socket Number of child objects: 4
>>> Name=NULL
>>> CPUType=sparcv9
>>> CPUModel=SPARC64_VII
>>> Cpuset: 0x0000ff00
>>> Online: 0x0000ff00
>>> Allowed: 0x0000ff00
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000300
>>> Online: 0x00000300
>>> Allowed: 0x00000300
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000100
>>> Online: 0x00000100
>>> Allowed: 0x00000100
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000200
>>> Online: 0x00000200
>>> Allowed: 0x00000200
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00000c00
>>> Online: 0x00000c00
>>> Allowed: 0x00000c00
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000400
>>> Online: 0x00000400
>>> Allowed: 0x00000400
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000800
>>> Online: 0x00000800
>>> Allowed: 0x00000800
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x00003000
>>> Online: 0x00003000
>>> Allowed: 0x00003000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00001000
>>> Online: 0x00001000
>>> Allowed: 0x00001000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00002000
>>> Online: 0x00002000
>>> Allowed: 0x00002000
>>> Type: Core Number of child objects: 2
>>> Name=NULL
>>> Cpuset: 0x0000c000
>>> Online: 0x0000c000
>>> Allowed: 0x0000c000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00004000
>>> Online: 0x00004000
>>> Allowed: 0x00004000
>>> Type: PU Number of child objects: 0
>>> Name=NULL
>>> Cpuset: 0x00008000
>>> Online: 0x00008000
>>> Allowed: 0x00008000
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] TOPOLOGY MATCHES -
> DISCARDING
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:orted_report_launch
> completed for daemon [[7567,0],1] at contact
>>> 495910912.1;tcp://193.174.26.199,192.168.128.2,10.1.1.2:37231
>>>
>>> ====================== ALLOCATED NODES ======================
>>> rs0: slots=2 max_slots=0 slots_inuse=0
>>> rs1: slots=2 max_slots=0 slots_inuse=0
>>> =================================================================
>>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:rsh: remote spawn called
>>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:rsh: remote spawn - have
> no children!
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] ORTE_ERROR_LOG: Not found in file
>>> ../../../../../openmpi-1.7.4/orte/mca/rmaps/rank_file/rmaps_rank_file.c at line 283
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] ORTE_ERROR_LOG: Not found in file
>>> ../../../../openmpi-1.7.4/orte/mca/rmaps/base/rmaps_base_map_job.c at line 284
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:orted_cmd sending
> orted_exit commands
>>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:receive stop comm
>>> [rs0.informatik.hs-fulda.de:06088] mca: base: close: component rsh closed
>>> [rs0.informatik.hs-fulda.de:06088] mca: base: close: unloading component rsh
>>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:base:receive stop comm
>>> [rs1.informatik.hs-fulda.de:09721] mca: base: close: component rsh closed
>>> [rs1.informatik.hs-fulda.de:09721] mca: base: close: unloading component rsh
>>> rs0 openmpi_1.7.x_or_newer 114
>>>
>>>
>>>
>>>
>>> I still have the problem that I get no output if I mix little-endian and
>>> big-endian machines, which worked with openmpi-1.6.x.
>>>
>>> linpc1 openmpi_1.7.x_or_newer 112 mpiexec -report-bindings -np 4 \
>>> -rf rf_linpc_sunpc_tyr hostname
>>> linpc1 openmpi_1.7.x_or_newer 113
>>>
>>>
>>>
>>> linpc1 openmpi_1.7.x_or_newer 188 mpiexec -report-bindings --display-allocation \
>>> --mca plm_base_verbose 100 -np 1 -rf rf_linpc_sunpc_tyr hostname
>>> [linpc1:20650] mca: base: components_register: registering plm components
>>> [linpc1:20650] mca: base: components_register: found loaded component rsh
>>> [linpc1:20650] mca: base: components_register: component rsh register
> function successful
>>> [linpc1:20650] mca: base: components_register: found loaded component slurm
>>> [linpc1:20650] mca: base: components_register: component slurm register
> function successful
>>> [linpc1:20650] mca: base: components_open: opening plm components
>>> [linpc1:20650] mca: base: components_open: found loaded component rsh
>>> [linpc1:20650] mca: base: components_open: component rsh open function
> successful
>>> [linpc1:20650] mca: base: components_open: found loaded component slurm
>>> [linpc1:20650] mca: base: components_open: component slurm open function
> successful
>>> [linpc1:20650] mca:base:select: Auto-selecting plm components
>>> [linpc1:20650] mca:base:select:( plm) Querying component [rsh]
>>> [linpc1:20650] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path
> NULL
>>> [linpc1:20650] mca:base:select:( plm) Query of component [rsh] set priority
> to 10
>>> [linpc1:20650] mca:base:select:( plm) Querying component [slurm]
>>> [linpc1:20650] mca:base:select:( plm) Skipping component [slurm]. Query
> failed to return a module
>>> [linpc1:20650] mca:base:select:( plm) Selected component [rsh]
>>> [linpc1:20650] mca: base: close: component slurm closed
>>> [linpc1:20650] mca: base: close: unloading component slurm
>>> [linpc1:20650] plm:base:set_hnp_name: initial bias 20650 nodename hash
> 3902177415
>>> [linpc1:20650] plm:base:set_hnp_name: final jobfam 14523
>>> [linpc1:20650] [[14523,0],0] plm:rsh_setup on agent ssh : rsh path NULL
>>> [linpc1:20650] [[14523,0],0] plm:base:receive start comm
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_job
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm creating map
>>> [linpc1:20650] [[14523,0],0] setup:vm: working unmanaged allocation
>>> [linpc1:20650] [[14523,0],0] using rankfile rf_linpc_sunpc_tyr
>>> [linpc1:20650] [[14523,0],0] checking node linpc0
>>> [linpc1:20650] [[14523,0],0] checking node linpc1
>>> [linpc1:20650] [[14523,0],0] ignoring myself
>>> [linpc1:20650] [[14523,0],0] checking node sunpc1
>>> [linpc1:20650] [[14523,0],0] checking node tyr
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm add new daemon [[14523,0],1]
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm assigning new daemon
> [[14523,0],1] to node linpc0
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm add new daemon [[14523,0],2]
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm assigning new daemon
> [[14523,0],2] to node sunpc1
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm add new daemon [[14523,0],3]
>>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm assigning new daemon
> [[14523,0],3] to node tyr
>>> [linpc1:20650] [[14523,0],0] plm:rsh: launching vm
>>> [linpc1:20650] [[14523,0],0] plm:rsh: local shell: 2 (tcsh)
>>> [linpc1:20650] [[14523,0],0] plm:rsh: assuming same remote shell as local
> shell
>>> [linpc1:20650] [[14523,0],0] plm:rsh: remote shell: 2 (tcsh)
>>> [linpc1:20650] [[14523,0],0] plm:rsh: final template argv:
>>> /usr/local/bin/ssh <template> orted -mca orte_report_bindings 1 -mca
> ess env -mca orte_ess_jobid 951779328 -mca
>>> orte_ess_vpid <template> -mca orte_ess_num_procs 4 -mca orte_hnp_uri
> "951779328.0;tcp://193.174.26.208:46876" --tree-spawn
>>> --mca plm_base_verbose 100 -mca plm rsh -mca hwloc_base_report_bindings 1
> -mca orte_display_alloc 1 -mca orte_rankfile
>>> rf_linpc_sunpc_tyr
>>> [linpc1:20650] [[14523,0],0] plm:rsh:launch daemon 0 not a child of mine
>>> [linpc1:20650] [[14523,0],0] plm:rsh: adding node linpc0 to launch list
>>> [linpc1:20650] [[14523,0],0] plm:rsh: adding node sunpc1 to launch list
>>> [linpc1:20650] [[14523,0],0] plm:rsh:launch daemon 3 not a child of mine
>>> [linpc1:20650] [[14523,0],0] plm:rsh: activating launch event
>>> [linpc1:20650] [[14523,0],0] plm:rsh: recording launch of daemon
> [[14523,0],1]
>>> [linpc1:20650] [[14523,0],0] plm:rsh: recording launch of daemon
> [[14523,0],2]
>>> [linpc1:20650] [[14523,0],0] plm:rsh: executing: (/usr/local/bin/ssh)
> [/usr/local/bin/ssh sunpc1 orted -mca
>>> orte_report_bindings 1 -mca ess env -mca orte_ess_jobid 951779328 -mca
> orte_ess_vpid 2 -mca orte_ess_num_procs 4 -mca
>>> orte_hnp_uri "951779328.0;tcp://193.174.26.208:46876" --tree-spawn --mca
> plm_base_verbose 100 -mca plm rsh -mca
>>> hwloc_base_report_bindings 1 -mca orte_display_alloc 1 -mca orte_rankfile
> rf_linpc_sunpc_tyr]
>>> [linpc1:20650] [[14523,0],0] plm:rsh: executing: (/usr/local/bin/ssh)
> [/usr/local/bin/ssh linpc0 orted -mca
>>> orte_report_bindings 1 -mca ess env -mca orte_ess_jobid 951779328 -mca
> orte_ess_vpid 1 -mca orte_ess_num_procs 4 -mca
>>> orte_hnp_uri "951779328.0;tcp://193.174.26.208:46876" --tree-spawn --mca
> plm_base_verbose 100 -mca plm rsh -mca
>>> hwloc_base_report_bindings 1 -mca orte_display_alloc 1 -mca orte_rankfile
> rf_linpc_sunpc_tyr]
>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>> X11 forwarding request failed on channel 0
>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>> [sunpc1:09408] mca: base: components_register: registering plm components
>>> [sunpc1:09408] mca: base: components_register: found loaded component rsh
>>> [sunpc1:09408] mca: base: components_register: component rsh register
> function successful
>>> [sunpc1:09408] mca: base: components_open: opening plm components
>>> [sunpc1:09408] mca: base: components_open: found loaded component rsh
>>> [sunpc1:09408] mca: base: components_open: component rsh open function
> successful
>>> [sunpc1:09408] mca:base:select: Auto-selecting plm components
>>> [sunpc1:09408] mca:base:select:( plm) Querying component [rsh]
>>> [sunpc1:09408] [[14523,0],2] plm:rsh_lookup on agent ssh : rsh path NULL
>>> [sunpc1:09408] mca:base:select:( plm) Query of component [rsh] set priority
> to 10
>>> [sunpc1:09408] mca:base:select:( plm) Selected component [rsh]
>>> [sunpc1:09408] [[14523,0],2] plm:rsh_setup on agent ssh : rsh path NULL
>>> [sunpc1:09408] [[14523,0],2] plm:base:receive start comm
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch from daemon
> [[14523,0],2]
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch from daemon
> [[14523,0],2] on node sunpc1
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch completed for
> daemon [[14523,0],2] at contact
>>> 951779328.2;tcp://193.174.26.210:33215
>>> [sunpc1:09408] [[14523,0],2] plm:rsh: remote spawn called
>>> [sunpc1:09408] [[14523,0],2] plm:rsh: remote spawn - have no children!
>>> [linpc0:32306] mca: base: components_register: registering plm components
>>> [linpc0:32306] mca: base: components_register: found loaded component rsh
>>> [linpc0:32306] mca: base: components_register: component rsh register
> function successful
>>> [linpc0:32306] mca: base: components_open: opening plm components
>>> [linpc0:32306] mca: base: components_open: found loaded component rsh
>>> [linpc0:32306] mca: base: components_open: component rsh open function
> successful
>>> [linpc0:32306] mca:base:select: Auto-selecting plm components
>>> [linpc0:32306] mca:base:select:( plm) Querying component [rsh]
>>> [linpc0:32306] [[14523,0],1] plm:rsh_lookup on agent ssh : rsh path NULL
>>> [linpc0:32306] mca:base:select:( plm) Query of component [rsh] set priority
> to 10
>>> [linpc0:32306] mca:base:select:( plm) Selected component [rsh]
>>> [linpc0:32306] [[14523,0],1] plm:rsh_setup on agent ssh : rsh path NULL
>>> [linpc0:32306] [[14523,0],1] plm:base:receive start comm
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch from daemon
> [[14523,0],1]
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch from daemon
> [[14523,0],1] on node linpc0
>>> [linpc1:20650] [[14523,0],0] RECEIVED TOPOLOGY FROM NODE linpc0
>>> [linpc1:20650] Type: Machine Number of child objects: 2
>>> Name=NULL
>>> total=8387048KB
>>> DMIProductName="Sun Ultra 40 Workstation"
>>> DMIProductVersion=11
>>> DMIBoardVendor="Sun Microsystems"
>>> DMIBoardName="Sun Ultra 40 Workstation"
>>> DMIBoardVersion=50
>>> DMIBoardAssetTag=
>>> DMIChassisVendor="Sun Microsystems"
>>> DMIChassisType=17
>>> DMIChassisVersion=01
>>> DMIChassisAssetTag=
>>> DMIBIOSVendor="Phoenix Technologies Ltd."
>>> DMIBIOSVersion="1.70 "
>>> DMIBIOSDate=02/15/2008
>>> DMISysVendor="Sun Microsystems"
>>> Backend=Linux
>>> OSName=Linux
>>> OSRelease=3.1.10-1.16-desktop
>>> OSVersion="#1 SMP PREEMPT Wed Jun 27 05:21:40 UTC 2012 (d016078)"
>>> Architecture=x86_64
>>> Cpuset: 0x0000000f
>>> Online: 0x0000000f
>>> Allowed: 0x0000000f
>>> Bind CPU proc: TRUE
>>> Bind CPU thread: TRUE
>>> Bind MEM proc: FALSE
>>> Bind MEM thread: TRUE
>>> Type: NUMANode Number of child objects: 2
>>> Name=NULL
>>> local=4192744KB
>>> total=4192744KB
>>> Cpuset: 0x00000003
>>> Online: 0x00000003
>>> Allowed: 0x00000003
>>> Type: Socket Number of child objects: 2
>>> Name=NULL
>>> CPUModel="Dual Core AMD Opteron(tm) Processor 280"
>>> Cpuset: 0x00000003
>>> Online: 0x00000003
>>> Allowed: 0x00000003
>>> Type: L2Cache Number of child objects: 1
>>> Name=NULL
>>> size=1024KB
>>> linesize=64
>>> ways=16
>>> Cpuset: 0x00000001
>>> Online: 0x00000001
>>> Allowed: 0x00000001
>>> Type: L1dCache Number of child objects: 1
>>> Name=NULL
>>> size=64KB
>>> linesize=64
>>> ways=2
>>> Cpuset: 0x00000001
>>> Online: 0x00000001
>>> Allowed: 0x00000001
>>> Type: Core Number of child objects: 1
>>> Name=NULL
>>> Cpuset: 0x00000001
>>> Online: 0x00000001
>>> Allowed: 0x00000001
>>> Type: PU Number of child
> objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000001
>>> Online: 0x00000001
>>> Allowed: 0x00000001
>>> Type: L2Cache Number of child objects: 1
>>> Name=NULL
>>> size=1024KB
>>> linesize=64
>>> ways=16
>>> Cpuset: 0x00000002
>>> Online: 0x00000002
>>> Allowed: 0x00000002
>>> Type: L1dCache Number of child objects: 1
>>> Name=NULL
>>> size=64KB
>>> linesize=64
>>> ways=2
>>> Cpuset: 0x00000002
>>> Online: 0x00000002
>>> Allowed: 0x00000002
>>> Type: Core Number of child objects: 1
>>> Name=NULL
>>> Cpuset: 0x00000002
>>> Online: 0x00000002
>>> Allowed: 0x00000002
>>> Type: PU Number of child
> objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000002
>>> Online: 0x00000002
>>> Allowed: 0x00000002
>>> Type: Bridge Host->PCI Number of child objects: 4
>>> Name=NULL
>>> buses=0000:[00-03]
>>> Type: PCI 10de:0053 Number of child objects: 1
>>> Name=nVidia Corporation CK804 IDE
>>> busid=0000:00:06.0
>>> class=0101(IDE)
>>> PCIVendor="nVidia Corporation"
>>> PCIDevice="CK804 IDE"
>>> Type: Block Number of child objects: 0
>>> Name=sr0
>>> Type: PCI 10de:0055 Number of child objects: 1
>>> Name=nVidia Corporation CK804 Serial ATA
> Controller
>>> busid=0000:00:07.0
>>> class=0101(IDE)
>>> PCIVendor="nVidia Corporation"
>>> PCIDevice="CK804 Serial ATA Controller"
>>> Type: Block Number of child objects: 0
>>> Name=sda
>>> Type: PCI 10de:0054 Number of child objects: 0
>>> Name=nVidia Corporation CK804 Serial ATA
> Controller
>>> busid=0000:00:08.0
>>> class=0101(IDE)
>>> PCIVendor="nVidia Corporation"
>>> PCIDevice="CK804 Serial ATA Controller"
>>> Type: PCI 10de:029d Number of child objects: 2
>>> Name=nVidia Corporation G71GL [Quadro FX
> 3500]
>>> busid=0000:03:00.0
>>> class=0300(VGA)
>>> PCIVendor="nVidia Corporation"
>>> PCIDevice="G71GL [Quadro FX 3500]"
>>> Type: GPU Number of child objects: 0
>>> Name=controlD64
>>> Type: GPU Number of child objects: 0
>>> Name=card0
>>> Type: NUMANode Number of child objects: 2
>>> Name=NULL
>>> local=4194304KB
>>> total=4194304KB
>>> Cpuset: 0x0000000c
>>> Online: 0x0000000c
>>> Allowed: 0x0000000c
>>> Type: Socket Number of child objects: 2
>>> Name=NULL
>>> CPUModel="Dual Core AMD Opteron(tm) Processor 280"
>>> Cpuset: 0x0000000c
>>> Online: 0x0000000c
>>> Allowed: 0x0000000c
>>> Type: L2Cache Number of child objects: 1
>>> Name=NULL
>>> size=1024KB
>>> linesize=64
>>> ways=16
>>> Cpuset: 0x00000004
>>> Online: 0x00000004
>>> Allowed: 0x00000004
>>> Type: L1dCache Number of child objects: 1
>>> Name=NULL
>>> size=64KB
>>> linesize=64
>>> ways=2
>>> Cpuset: 0x00000004
>>> Online: 0x00000004
>>> Allowed: 0x00000004
>>> Type: Core Number of child objects: 1
>>> Name=NULL
>>> Cpuset: 0x00000004
>>> Online: 0x00000004
>>> Allowed: 0x00000004
>>> Type: PU Number of child
> objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000004
>>> Online: 0x00000004
>>> Allowed: 0x00000004
>>> Type: L2Cache Number of child objects: 1
>>> Name=NULL
>>> size=1024KB
>>> linesize=64
>>> ways=16
>>> Cpuset: 0x00000008
>>> Online: 0x00000008
>>> Allowed: 0x00000008
>>> Type: L1dCache Number of child objects: 1
>>> Name=NULL
>>> size=64KB
>>> linesize=64
>>> ways=2
>>> Cpuset: 0x00000008
>>> Online: 0x00000008
>>> Allowed: 0x00000008
>>> Type: Core Number of child objects: 1
>>> Name=NULL
>>> Cpuset: 0x00000008
>>> Online: 0x00000008
>>> Allowed: 0x00000008
>>> Type: PU Number of child
> objects: 0
>>> Name=NULL
>>> Cpuset: 0x00000008
>>> Online: 0x00000008
>>> Allowed: 0x00000008
>>> Type: Bridge Host->PCI Number of child objects: 2
>>> Name=NULL
>>> buses=0000:[80-82]
>>> Type: PCI 10de:0054 Number of child objects: 0
>>> Name=nVidia Corporation CK804 Serial ATA
> Controller
>>> busid=0000:80:07.0
>>> class=0101(IDE)
>>> PCIVendor="nVidia Corporation"
>>> PCIDevice="CK804 Serial ATA Controller"
>>> Type: PCI 10de:0055 Number of child objects: 0
>>> Name=nVidia Corporation CK804 Serial ATA
> Controller
>>> busid=0000:80:08.0
>>> class=0101(IDE)
>>> PCIVendor="nVidia Corporation"
>>> PCIDevice="CK804 Serial ATA Controller"
>>> [linpc1:20650] [[14523,0],0] NEW TOPOLOGY - ADDING
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch completed for
> daemon [[14523,0],1] at contact
>>> 951779328.1;tcp://193.174.26.214,192.168.1.1:57891
>>> [linpc0:32306] [[14523,0],1] plm:rsh: remote spawn called
>>> [linpc0:32306] [[14523,0],1] plm:rsh: local shell: 2 (tcsh)
>>> [linpc0:32306] [[14523,0],1] plm:rsh: assuming same remote shell as local
> shell
>>> [linpc0:32306] [[14523,0],1] plm:rsh: remote shell: 2 (tcsh)
>>> [linpc0:32306] [[14523,0],1] plm:rsh: final template argv:
>>> /usr/local/bin/ssh <template> orted -mca orte_report_bindings 1 -mca
> ess env -mca orte_ess_jobid 951779328 -mca
>>> orte_ess_vpid <template> -mca orte_ess_num_procs 4 -mca orte_parent_uri
> "951779328.1;tcp://193.174.26.214,192.168.1.1:57891"
>>> -mca orte_hnp_uri "951779328.0;tcp://193.174.26.208:46876" --mca
> plm_base_verbose 100 -mca hwloc_base_report_bindings 1 -mca
>>> orte_display_alloc 1 -mca orte_rankfile rf_linpc_sunpc_tyr -mca plm rsh
>>> [linpc0:32306] [[14523,0],1] plm:rsh: activating launch event
>>> [linpc0:32306] [[14523,0],1] plm:rsh: recording launch of daemon
> [[14523,0],3]
>>> [linpc0:32306] [[14523,0],1] plm:rsh: executing: (/usr/local/bin/ssh)
> [/usr/local/bin/ssh tyr orted -mca orte_report_bindings
>>> 1 -mca ess env -mca orte_ess_jobid 951779328 -mca orte_ess_vpid 3 -mca
> orte_ess_num_procs 4 -mca orte_parent_uri
>>> "951779328.1;tcp://193.174.26.214,192.168.1.1:57891" -mca orte_hnp_uri
> "951779328.0;tcp://193.174.26.208:46876" --mca
>>> plm_base_verbose 100 -mca hwloc_base_report_bindings 1 -mca
> orte_display_alloc 1 -mca orte_rankfile rf_linpc_sunpc_tyr -mca
>>> plm rsh --tree-spawn]
>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_register:
> registering plm components
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_register: found
> loaded component rsh
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_register: component
> rsh register function successful
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_open: opening plm
> components
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_open: found loaded
> component rsh
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_open: component rsh
> open function successful
>>> [tyr.informatik.hs-fulda.de:23227] mca:base:select: Auto-selecting plm
> components
>>> [tyr.informatik.hs-fulda.de:23227] mca:base:select:( plm) Querying
> component [rsh]
>>> [tyr.informatik.hs-fulda.de:23227] [[14523,0],3] plm:rsh_lookup on agent ssh
> : rsh path NULL
>>> [tyr.informatik.hs-fulda.de:23227] mca:base:select:( plm) Query of
> component [rsh] set priority to 10
>>> [tyr.informatik.hs-fulda.de:23227] mca:base:select:( plm) Selected
> component [rsh]
>>> [tyr.informatik.hs-fulda.de:23227] [[14523,0],3] plm:rsh_setup on agent ssh
> : rsh path NULL
>>> [tyr.informatik.hs-fulda.de:23227] [[14523,0],3] plm:base:receive start comm
>>> [tyr.informatik.hs-fulda.de:23227] [[14523,0],3] plm:base:receive stop comm
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: close: component rsh closed
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: close: unloading component rsh
>>> [linpc0:32306] [[14523,0],1] daemon 3 failed with status 1
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_cmd sending orted_exit commands
>>> [linpc1:20650] [[14523,0],0] plm:base:receive stop comm
>>> [linpc1:20650] mca: base: close: component rsh closed
>>> [linpc1:20650] mca: base: close: unloading component rsh
>>> linpc1 openmpi_1.7.x_or_newer 189 [sunpc1:09408] [[14523,0],2]
> plm:base:receive stop comm
>>> [sunpc1:09408] mca: base: close: component rsh closed
>>> [sunpc1:09408] mca: base: close: unloading component rsh
>>> [linpc0:32306] [[14523,0],1] plm:base:receive stop comm
>>> [linpc0:32306] mca: base: close: component rsh closed
>>> [linpc0:32306] mca: base: close: unloading component rsh
>>>
>>> linpc1 openmpi_1.7.x_or_newer 189
>>>
>>>
>>>
>>> linpc1 openmpi_1.7.x_or_newer 189 mpiexec -report-bindings --display-allocation \
>>> --mca rmaps_base_verbose_100 -np 1 -rf rf_linpc_sunpc_tyr hostname
>>>
>>> ====================== ALLOCATED NODES ======================
>>> linpc1: slots=1 max_slots=0 slots_inuse=0
>>> =================================================================
>>> --------------------------------------------------------------------------
>>> mpiexec was unable to find the specified executable file, and therefore
>>> did not launch the job. This error was first reported for process
>>> rank 0; it may have occurred for other processes as well.
>>>
>>> NOTE: A common cause for this error is misspelling a mpiexec command
>>> line parameter option (remember that mpiexec interprets the first
>>> unrecognized command line token as the executable).
>>>
>>> Node: linpc1
>>> Executable: 1
>>> --------------------------------------------------------------------------
>>> linpc1 openmpi_1.7.x_or_newer 190
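>>
>> One hedged observation on this last run: -mca takes a parameter name and a value
>> as two separate arguments, so "--mca rmaps_base_verbose_100 -np" defines a
>> parameter literally named rmaps_base_verbose_100 with the value "-np", and the
>> following token "1" is then treated as the executable, which matches the
>> "Executable: 1" message above. The intended invocation was presumably:
>>
>> mpiexec -report-bindings --display-allocation \
>>   --mca rmaps_base_verbose 100 -np 1 -rf rf_linpc_sunpc_tyr hostname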
>>>
>>>
>>>
>>>
>>> Kind regards
>>>
>>> Siegmar
>>>
>>
>>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users