
PLPA Users' Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-12-16 17:35:33


On Dec 16, 2005, at 10:51 AM, Bogdan Costescu wrote:

> I have some comments regarding version 0.9a1:

Excellent. :-)

> 1. In api_probe(), there is a _get call, then a _set call. I think
> that this opens up a window of opportunity for a race, given that the
> affinity can be changed by another process with the same EUID or
> CAP_SYS_NICE. A typical PLPA usage would involve:
>
> api_probe()
> _get
> <- _set from other PID
> _set
> _get
>
> (just to be clear: these are syscalls and not plpa_... functions)
> The second _get call would get the "wrong" result; this would be the
> value that was obtained in the first _get and not the value that was
> set from the other process. Given that there is no atomic
> _get_and_set, I don't think that there is any way to prevent this if
> the validation via _set is desired. A situation like this could
> conceivably appear in a batch environment that knows about affinity
> and sets it by itself for the launched processes.
>
> I don't know the current kernel status regarding on-the-fly adding or
> removing of CPUs and maybe cpusets, but they would be represented
> by the kernel variable "cpu_online_map"; this is checked when calling
> _set and might return -EINVAL, which is the same error code returned
> for a wrong length - so if "cpu_online_map" changes as well between the
> _get and _set, the return value might be wrongly interpreted to mean
> "bad length".

Excellent point.

Hmm. Is there anything we can do about this? Should we just
document this behavior?
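
For the record, here is a rough sketch of the sequence being discussed, so
the window is easy to see. This is not the PLPA source: probe_once(),
classify_set(), the enum names, and the 128-byte buffer are all made up for
illustration, and it assumes a Linux system where the raw syscalls are
reachable through syscall(). It only marks where the race sits and shows
that an EINVAL from _set cannot be attributed to a single cause; it does
not close the window.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical outcome codes for this sketch; these are not PLPA names. */
enum probe_result {
    PROBE_OK,        /* _get and _set both succeeded */
    PROBE_AMBIGUOUS, /* _set gave EINVAL: bad length OR mask no longer valid */
    PROBE_ERROR      /* any other failure */
};

/* Interpret the result of the raw _set syscall (rc, plus errno on failure). */
static enum probe_result classify_set(long rc, int err)
{
    if (rc == 0)
        return PROBE_OK;
    if (err == EINVAL)
        return PROBE_AMBIGUOUS; /* the caller cannot tell which cause it was */
    return PROBE_ERROR;
}

/* One _get followed by one _set of the same mask, on the calling process. */
static enum probe_result probe_once(void)
{
    /* Assumed buffer size: 128 bytes, i.e. room for 1024 CPUs. */
    unsigned long mask[128 / sizeof(unsigned long)];
    long rc;

    /* The kernel rejects lengths that are not a multiple of a long. */
    assert(sizeof(mask) % sizeof(unsigned long) == 0);
    memset(mask, 0, sizeof(mask));

    /* _get: on success the raw syscall returns the number of bytes written. */
    rc = syscall(SYS_sched_getaffinity, 0, sizeof(mask), mask);
    if (rc < 0)
        return PROBE_ERROR;

    /*
     * Race window: if another process with the same EUID or CAP_SYS_NICE
     * changes our affinity here, the _set below silently overwrites it
     * with the older value obtained above.
     */

    /* _set: write back the mask we just read. */
    rc = syscall(SYS_sched_setaffinity, 0, sizeof(mask), mask);
    return classify_set(rc, errno);
}
```

A caller that gets PROBE_AMBIGUOUS could retry the whole _get/_set pair
before concluding that the length is wrong, but that only makes a
misreading less likely; it does not remove the race.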

> 2. The README file suggests that PLPA detects failures in the glibc
> implementations - I think that this doesn't reflect the current
> version which uses syscalls and doesn't do any glibc call. There is
> also a paragraph in the "How do I use PLPA?" section that starts with
> "These functions perform the run-time test..." which I think is also
> obsolete.

Doh! I went through and revamped README and thought I removed all
the glibc stuff -- I'll go ditch that, too.

> 3. The README file and source comments contain some typos - patch
> attached. Please apply the patch before making any modifications
> suggested by comment #2.

W00t. Applied and committed; thanks!

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/