Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] OpenMPI, PLPA and Linux cpuset/cgroup support
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-07-15 22:19:07


Looking at your command line, did you remember to set -mca
mpi_paffinity_alone 1? If not, we won't set affinity on the processes.
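One way to see whether binding took effect is to have a process report its own allowed CPU set. The following is only a minimal sketch, assuming Linux and Python 3.3 or later; `describe_binding` is a hypothetical helper name, not part of Open MPI or PLPA. A process that was bound to one core should report a single CPU; one that was left alone should report the whole cpuset.

```python
import os


def describe_binding(pid=0):
    """Return (sorted list of allowed CPUs, True if pinned to exactly one CPU).

    pid=0 means the calling process, as with sched_getaffinity(2).
    """
    cpus = sorted(os.sched_getaffinity(pid))
    return cpus, len(cpus) == 1


if __name__ == "__main__":
    cpus, pinned = describe_binding()
    state = "bound to one core" if pinned else "not bound to a single core"
    print(f"pid {os.getpid()}: allowed CPUs {cpus} ({state})")
```

This reads the same kernel state that a taskset-style tool reports for a given pid, so it is a cross-check rather than a replacement for looking at the strace output.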

On Jul 15, 2009, at 8:11 PM, Chris Samuel wrote:

>
> ----- "Ralph Castain" <rhc_at_[hidden]> wrote:
>
>> Could you check this? You can run a trivial job using the -npernode x
>> option, where x matches the number of cores you were allocated on the
>> nodes. If you do this, do we bind to the correct cores?
>
> Nope, I'm afraid it doesn't - submitted a job asking
> for 4 cores on one node and was allocated cores 0-3 in
> the cpuset.
>
> Grep'ing the strace output for anything mentioning affinity shows:
>
> [csamuel_at_tango027 CPI]$ fgrep affinity cpi-trace.txt
> 11412 execve("/usr/local/openmpi/1.3.3-gcc/bin/mpiexec", ["mpiexec", "--mca", "paffinity", "linux", "-npernode", "4", "/home/csamuel/Sources/Tests/CPI/"...], [/* 56 vars */]) = 0
> 11412 sched_getaffinity(0, 128, { f }) = 8
> 11412 sched_setaffinity(0, 8, { 0 }) = -1 EFAULT (Bad address)
> 11416 sched_getaffinity(0, 128, <unfinished ...>
> 11416 <... sched_getaffinity resumed> { f }) = 8
> 11416 sched_setaffinity(0, 8, { 0 } <unfinished ...>
> 11416 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
> 11414 sched_getaffinity(0, 128, <unfinished ...>
> 11414 <... sched_getaffinity resumed> { f }) = 8
> 11414 sched_setaffinity(0, 8, { 0 } <unfinished ...>
> 11414 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
> 11413 sched_getaffinity(0, 128, <unfinished ...>
> 11413 <... sched_getaffinity resumed> { f }) = 8
> 11413 sched_setaffinity(0, 8, { 0 } <unfinished ...>
> 11413 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
> 11415 sched_getaffinity(0, 128, <unfinished ...>
> 11415 <... sched_getaffinity resumed> { f }) = 8
> 11415 sched_setaffinity(0, 8, { 0 } <unfinished ...>
> 11415 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
> 11413 sched_getaffinity(11413, 8, <unfinished ...>
> 11415 sched_getaffinity(11415, 8, <unfinished ...>
> 11413 <... sched_getaffinity resumed> { f }) = 8
> 11415 <... sched_getaffinity resumed> { f }) = 8
> 11414 sched_getaffinity(11414, 8, <unfinished ...>
> 11414 <... sched_getaffinity resumed> { f }) = 8
> 11416 sched_getaffinity(11416, 8, <unfinished ...>
> 11416 <... sched_getaffinity resumed> { f }) = 8
>
> I can confirm that it's not worked by checking what
> plpa-taskset says about a process (for example 11414):
>
> [root_at_tango027 plpa-taskset]# ./plpa-taskset -cp 11414
> pid 11414's current affinity list: 0-3
>
> According to the manual page:
>
> EFAULT A supplied memory address was invalid.
>
> This is on a dual socket quad core AMD Shanghai system
> running the 2.6.28.9 kernel (not had a chance to upgrade
> recently).
>
> Will do some more poking around after lunch.
>
> cheers,
> Chris
> --
> Christopher Samuel - (03) 9925 4751 - Systems Manager
> The Victorian Partnership for Advanced Computing
> P.O. Box 201, Carlton South, VIC 3053, Australia
> VPAC is a not-for-profit Registered Research Agency
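For reading the quoted strace output: the `{ f }` that sched_getaffinity returns appears to be the affinity mask in hexadecimal, so `f` (binary 1111) would correspond to CPUs 0-3, which agrees with the "0-3" list that plpa-taskset prints. The sketch below shows that conversion, assuming the mask fits in a single hex word as it does here; `mask_to_cpu_list` is a hypothetical helper name used only for illustration.

```python
def mask_to_cpu_list(hex_mask):
    """Convert a hex affinity mask such as 'f' into a range list such as '0-3'."""
    value = int(hex_mask, 16)
    # Bit n set in the mask means CPU n is allowed.
    cpus = [bit for bit in range(value.bit_length()) if (value >> bit) & 1]

    # Collapse consecutive CPU numbers into (start, end) ranges.
    ranges = []
    start = prev = None
    for cpu in cpus:
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))

    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


print(mask_to_cpu_list("f"))  # prints 0-3
```

Under the same reading, a process bound to core 2 alone would show a mask of `4` rather than `f`.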