Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] problems with rankfile in openmpi-1.9a1r29097
From: Siegmar Gross (Siegmar.Gross_at_[hidden])
Date: 2013-09-03 04:20:31


Hi,

> 3) I have a problem on "tyr" (Solaris 10 sparc).
>
> tyr rankfiles 106 mpiexec -report-bindings \
> -rf rf_tyr_semicolon -np 1 hostname
> [tyr.informatik.hs-fulda.de:29849] [[53951,0],0] ORTE_ERROR_LOG:
> Not found in file
> ../../../../../openmpi-1.9a1r29097/orte/mca/rmaps/rank_file/rmaps_rank_file.c
> at line 276
> [tyr.informatik.hs-fulda.de:29849] [[53951,0],0] ORTE_ERROR_LOG:
> Not found in file
> ../../../../openmpi-1.9a1r29097/orte/mca/rmaps/base/rmaps_base_map_job.c
> at line 173
> tyr rankfiles 107

This one works now. I found a strange character in the rankfile, which
I removed.

tyr rankfiles 103 mpiexec -report-bindings \
  -rf rf_tyr_semicolon -np 1 hostname
[tyr.informatik.hs-fulda.de:00079] MCW rank 0 bound to
  socket 0[core 0[hwt 0]], socket 1[core 1[hwt 0]]: [B][B]
tyr.informatik.hs-fulda.de
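
A stray character like this is hard to see in an editor, so a small check can help. The following is only a minimal sketch in Python, not part of Open MPI: the function name `find_strange_chars` and the sample rankfile line are made up for illustration, and "strange" is assumed to mean anything other than printable ASCII or a tab.

```python
def find_strange_chars(text):
    """Return (line, column, code point) for every suspicious character.

    Assumption: only printable ASCII (32..126) and tab are expected in a
    rankfile; everything else is reported. Lines and columns are 1-based.
    """
    hits = []
    # Split on "\n" only, so that a stray "\r" is reported rather than
    # silently treated as a line break.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch != "\t" and not (32 <= ord(ch) <= 126):
                hits.append((line_no, col, ord(ch)))
    return hits


if __name__ == "__main__":
    # Hypothetical rankfile line with a non-breaking space (U+00A0) at the end.
    sample = "rank 0=tyr slot=0:0\u00a0\n"
    for line_no, col, code in find_strange_chars(sample):
        print(f"line {line_no}, column {col}: U+{code:04X}")
```

To check a real file, one could read it as bytes and decode it with `latin-1` (so that every byte maps to exactly one character) before passing the text to the function.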

> I get the following output, if I try the rankfile from a different machine
> (also Solaris 10 sparc).
>
> rs0 rankfiles 104 mpiexec -report-bindings -rf rf_tyr_semicolon -np 1 hostname
> --------------------------------------------------------------------------
> All nodes which are allocated for this job are already filled.
> --------------------------------------------------------------------------
> rs0 rankfiles 105

No change in this case.

rs0 rankfiles 102 mpiexec -report-bindings \
  -rf rf_tyr_semicolon -np 1 hostname
--------------------------------------------------------------------------
All nodes which are allocated for this job are already filled.
--------------------------------------------------------------------------

I checked the other rankfiles as well and they are OK, so problems
1) and 2) from my previous e-mail still exist.

Kind regards

Siegmar