Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] problem with rankfile in openmpi-1.6.4rc2
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-01-24 22:50:39


I built the current 1.6 branch (which hasn't seen any changes that would affect this function) and was able to execute it just fine on a single-socket machine. I then gave it your slot list, which of course failed because I don't have two active sockets (one is empty), but it appeared to parse the list just fine.

From what I can tell, it looks like your linpc1 is unable to detect a second socket for some reason when given the slot_list argument. I'll have to try again tomorrow when I have access to a dual-socket machine.
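
To make the difference between "the list parses" and "the sockets exist" concrete, here is a minimal Python sketch. It is not Open MPI's parser: the function names parse_slot_list and missing_sockets are hypothetical, and it handles only the socket:core-range form used in the rankfile quoted below, not every form a rankfile may accept.

```python
def parse_slot_list(spec):
    """Parse 'socket:cores' entries such as '0:0-1,1:0-1' into {socket: [cores]}.

    Illustrative only; this is an assumption-laden sketch, not Open MPI code.
    """
    slots = {}
    for entry in spec.split(","):
        socket_part, _, core_part = entry.partition(":")
        if not core_part:
            raise ValueError(f"expected socket:cores, got {entry!r}")
        lo, _, hi = core_part.partition("-")
        first = int(lo)
        last = int(hi) if hi else first  # a single core has no '-'
        if last < first:
            raise ValueError(f"descending core range in {entry!r}")
        slots.setdefault(int(socket_part), []).extend(range(first, last + 1))
    return slots


def missing_sockets(slots, available_sockets):
    """Return the requested sockets that the machine does not report."""
    return sorted(s for s in slots if s not in available_sockets)


slots = parse_slot_list("0:0-1,1:0-1")
print(slots)                        # {0: [0, 1], 1: [0, 1]}
print(missing_sockets(slots, {0}))  # [1]
```

With only socket 0 available, as on my test machine, the list parses cleanly but socket 1 comes back as missing, which matches the kind of failure I saw. If linpc1 behaves the same way even though it has two sockets, the detection step is the likely suspect rather than the slot-list syntax.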

On Jan 19, 2013, at 1:45 AM, Siegmar Gross <Siegmar.Gross_at_[hidden]> wrote:

> Hi
>
> I have installed openmpi-1.6.4rc2 and still have a problem with my
> rankfile.
>
> linpc1 rankfiles 113 ompi_info | grep "Open MPI:"
> Open MPI: 1.6.4rc2r27861
>
> linpc1 rankfiles 114 cat rf_linpc1
> rank 0=linpc1 slot=0:0-1,1:0-1
>
> linpc1 rankfiles 115 mpiexec -report-bindings -np 1 \
> -rf rf_linpc1 hostname
> --------------------------------------------------------------------
> We were unable to successfully process/set the requested processor
> affinity settings:
>
> Specified slot list: 0:0-1,1:0-1
> Error: Error
>
> This could mean that a non-existent processor was specified, or
> that the specification had improper syntax.
> --------------------------------------------------------------------
> --------------------------------------------------------------------
> mpiexec was unable to start the specified application as it
> encountered an error:
>
> Error name: Error
> Node: linpc1
>
> when attempting to start process rank 0.
> --------------------------------------------------------------------
>
>
> Everything works fine with the following command.
>
> linpc1 rankfiles 116 mpiexec -report-bindings -np 1 -cpus-per-proc 4 \
> -bycore -bind-to-core hostname
> [linpc1:20140] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]: [B B][B B]
> linpc1
>
>
> I would be grateful if somebody could fix the problem. Thank you very
> much for any help in advance.
>
>
> Kind regards
>
> Siegmar