Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] problem when mpi_paffinity_alone is set to 1
From: Camille Coti (coti_at_[hidden])
Date: 2008-08-22 09:21:19


OK, thank you!

Camille

Ralph Castain wrote:
> Okay, I'll look into it. I suspect the problem is due to the
> redefinition of the paffinity API to clarify physical vs logical
> processors - more than likely, the maffinity interface suffers from the
> same problem we had to correct over there.
>
> We'll report back later with an estimate of how quickly this can be fixed.
>
> Thanks
> Ralph
>
> On Aug 22, 2008, at 7:03 AM, Camille Coti wrote:
>
>>
>> Ralph,
>>
>> I compiled a clean checkout from the trunk (r19392), and the problem is
>> still the same.
>>
>> Camille
>>
>>
>> Ralph Castain wrote:
>>> Hi Camille
>>> What OMPI version are you using? We just changed the paffinity module
>>> last night, but did nothing to maffinity. However, it is possible
>>> that the maffinity framework makes some calls into paffinity that
>>> need to be adjusted.
>>> So the version number would help a great deal in this case.
>>> Thanks
>>> Ralph
>>> On Aug 22, 2008, at 5:23 AM, Camille Coti wrote:
>>>> Hello,
>>>>
>>>> I am trying to run applications on a shared-memory machine. For the
>>>> moment I am just running tests on point-to-point
>>>> communications (a trivial token ring) and collective operations
>>>> (from the SkaMPI test suite).
>>>>
>>>> It runs smoothly if mpi_paffinity_alone is set to 0. With more than
>>>> about 10 processes, global communications just don't seem possible.
>>>> Point-to-point communications seem to be OK.
>>>>
>>>> But when I specify --mca mpi_paffinity_alone 1 in my command line,
>>>> I get the following error:
>>>>
>>>> mbind: Invalid argument
>>>>
>>>> I looked into the code of maffinity/libnuma, and found out the error
>>>> comes from
>>>>
>>>> numa_setlocal_memory(segments[i].mbs_start_addr,
>>>> segments[i].mbs_len);
>>>>
>>>> in maffinity_libnuma_module.c.
>>>>
>>>> The machine I am using is a Linux box running a 2.6.5-7 kernel.
>>>>
>>>> Has anyone experienced a similar problem?
>>>>
>>>> Camille
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
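
For anyone hitting the same "mbind: Invalid argument" message: the thread does not establish the cause, and Ralph suspects the physical vs logical processor change. Separately, the mbind(2) man page lists a start address that is not a multiple of the system page size as one reason for EINVAL. The sketch below is only an illustration of that one possibility, not the Open MPI fix. It shows how a segment such as (mbs_start_addr, mbs_len) could be expanded to page boundaries before being handed to numa_setlocal_memory(). The names aligned_range, page_align_range_with and page_align_range are hypothetical helpers and do not exist in Open MPI or libnuma.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper type; not part of Open MPI or libnuma. */
struct aligned_range {
    void  *start;  /* start address rounded down to a page boundary */
    size_t len;    /* length covering whole pages up to the rounded-up end */
};

/* Expand [addr, addr + len) outward to page boundaries.
 * page_size must be a power of two. */
struct aligned_range page_align_range_with(void *addr, size_t len,
                                           size_t page_size)
{
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);

    uintptr_t mask  = (uintptr_t)page_size - 1;
    uintptr_t begin = (uintptr_t)addr;
    uintptr_t end   = begin + len;

    uintptr_t aligned_begin = begin & ~mask;          /* round down */
    uintptr_t aligned_end   = (end + mask) & ~mask;   /* round up   */

    struct aligned_range r;
    r.start = (void *)aligned_begin;
    r.len   = (size_t)(aligned_end - aligned_begin);
    return r;
}

/* Same as above, using the page size reported by the running system. */
struct aligned_range page_align_range(void *addr, size_t len)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    return page_align_range_with(addr, len, (size_t)page_size);
}

/* Assumed usage (not compiled here, needs libnuma):
 *   struct aligned_range r = page_align_range(seg_start, seg_len);
 *   numa_setlocal_memory(r.start, r.len);
 */
```

Note that rounding outward also applies the memory policy to whatever else shares the first and last page, so whether this is acceptable depends on how the segments are laid out.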