Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Specifying slots in rankfile
From: Grzegorz Maj (maju3_at_[hidden])
Date: 2010-06-09 06:39:09


In my previous mail I said that slot=0-3 would be a solution.
Unfortunately, it gives me exactly the same segfault as slot=*:* does.

2010/6/9 Grzegorz Maj <maju3_at_[hidden]>:
> Hi,
> I'd like mpirun to run tasks with specific ranks on specific hosts,
> but I don't want to provide any particular sockets/slots/cores.
> The following example uses just one host, but generally I'll use more.
> In my hostfile I just have:
>
> root_at_host01 slots=4
>
> I have been experimenting with my rankfile to achieve this, but I
> always run into problems.
>
> 1) With a rankfile like:
> rank 0=host01 slot=*
> rank 1=host01 slot=*
> rank 2=host01 slot=*
> rank 3=host01 slot=*
>
> I get:
>
> --------------------------------------------------------------------------
> We were unable to successfully process/set the requested processor
> affinity settings:
>
> Specified slot list: *
> Error: Error
>
> This could mean that a non-existent processor was specified, or
> that the specification had improper syntax.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun was unable to start the specified application as it encountered an error:
>
> Error name: Error
> Node: host01
>
> when attempting to start process rank 0.
> --------------------------------------------------------------------------
> [host01:13715] Rank 0: PAFFINITY cannot get physical processor id for
> logical processor 4
>
>
> I think it tries to find processor #4, but there are only processors 0-3.
>
> 2) With a rankfile like:
> rank 0=host01 slot=*:*
> rank 1=host01 slot=*:*
> rank 2=host01 slot=*:*
> rank 3=host01 slot=*:*
>
> Everything looks fine, i.e. my programs are spread across 4 processors.
> But when running an MPI program as follows:
>
> MPI::Init(argc, argv);
> fprintf(stderr, "after init %d\n", MPI::Is_initialized());
> nprocs_mpi = MPI::COMM_WORLD.Get_size();
> fprintf(stderr, "won't get here\n");
>
> I get:
>
> after init 1
> [host01:14348] *** Process received signal ***
> [host01:14348] Signal: Segmentation fault (11)
> [host01:14348] Signal code: Address not mapped (1)
> [host01:14348] Failing at address: 0x8
> [host01:14348] [ 0] [0xffffe410]
> [host01:14348] [ 1] p(_ZNK3MPI4Comm8Get_sizeEv+0x19) [0x8051299]
> [host01:14348] [ 2] p(main+0x86) [0x804ee4e]
> [host01:14348] [ 3] /lib/libc.so.6(__libc_start_main+0xe5) [0x4180b5c5]
> [host01:14348] [ 4] p(__gxx_personality_v0+0x125) [0x804ecc1]
> [host01:14348] *** End of error message ***
>
> I'm using Open MPI v1.4.2 (downloaded yesterday).
> In my rankfile I really want to write something like slot=*. I know
> slot=0-3 would be a solution, but when generating the rankfile I may not
> be sure how many processors are available on a particular host.
>
> Any help would be appreciated.
>
> Regards,
> Grzegorz Maj
>
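
For reference, here is a minimal sketch of generating a rankfile with
explicit slot ranges (the slot=0-3 form mentioned above) instead of
slot=*. The helper name make_rankfile and its input format are
hypothetical and not part of Open MPI, and the per-host core counts are
assumed to be known already (for example from the slots= values in the
hostfile). As noted at the top of this mail, slot=0-3 hits the same
segfault here, so this only illustrates the rankfile format and is not
a fix for the crash.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not an Open MPI API): builds rankfile text that
// assigns consecutive ranks to hosts in the order given, one rank per core,
// and lets each rank use the whole explicit slot range 0..(cores-1).
// hosts: pairs of (hostname, number of cores assumed for that host).
std::string make_rankfile(const std::vector<std::pair<std::string, int>>& hosts) {
    std::ostringstream out;
    int rank = 0;
    for (const auto& h : hosts) {
        const std::string& name = h.first;
        const int cores = h.second;
        if (cores <= 0) {
            continue;  // skip hosts with no usable cores
        }
        for (int i = 0; i < cores; ++i) {
            out << "rank " << rank++ << "=" << name << " slot=0";
            if (cores > 1) {
                out << "-" << (cores - 1);  // e.g. slot=0-3 for 4 cores
            }
            out << "\n";
        }
    }
    return out.str();
}
```

Calling make_rankfile({{"host01", 4}}) produces four lines, rank 0
through rank 3, each of the form "rank N=host01 slot=0-3".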