The symptom is that the process hangs forever. It's difficult to distinguish this bug from simply running out of registered memory.
The bug is hit if the PML is using the mpi_leave_pinned protocol and the BTL returns an error from its send function.
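To illustrate the kind of leak described here, a minimal sketch follows. It is not the ob1 code or the actual fix from the commit; the pool, the function names (frag_alloc, frag_return, send_leaky, send_fixed, failing_send) and the return codes are all hypothetical stand-ins. It only shows the general pattern: a fragment is taken from a fixed-size pool, the send returns an error, and the error path returns without giving the fragment back. Once the pool is drained, nothing more can be sent, which would look much like running out of resources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real ob1/BTL structures. */
#define POOL_SIZE 4

typedef struct {
    int in_use[POOL_SIZE];   /* 1 while a fragment is checked out */
} frag_pool_t;

/* Take a fragment from the pool; returns its index, or -1 if none is free. */
static int frag_alloc(frag_pool_t *pool) {
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Give a fragment back to the pool. */
static void frag_return(frag_pool_t *pool, int frag) {
    assert(frag >= 0 && frag < POOL_SIZE);
    pool->in_use[frag] = 0;
}

/* Count fragments that are currently checked out. */
static int frags_outstanding(const frag_pool_t *pool) {
    int n = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        n += pool->in_use[i];
    }
    return n;
}

/* Hypothetical transport send that always fails (assumed error code -1). */
static int failing_send(int frag) {
    (void)frag;
    return -1;
}

/* Leaky version: the error path forgets to return the fragment. */
static int send_leaky(frag_pool_t *pool, int (*send_fn)(int)) {
    int frag = frag_alloc(pool);
    if (frag < 0) {
        return -2;               /* pool exhausted (assumed code) */
    }
    int rc = send_fn(frag);
    if (rc != 0) {
        return rc;               /* BUG: frag is never returned */
    }
    frag_return(pool, frag);     /* simplified: release right after a successful send */
    return 0;
}

/* Fixed version: every exit path after allocation returns the fragment. */
static int send_fixed(frag_pool_t *pool, int (*send_fn)(int)) {
    int frag = frag_alloc(pool);
    if (frag < 0) {
        return -2;
    }
    int rc = send_fn(frag);
    frag_return(pool, frag);
    return rc;
}
```

With the leaky version, POOL_SIZE failed sends are enough to drain the pool, and every later call reports exhaustion. A caller that waits for a free fragment instead of failing would then wait forever.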
From: devel-bounces_at_[hidden] [devel-bounces_at_[hidden]] on behalf of Christopher Samuel [samuel_at_[hidden]]
Sent: Thursday, March 01, 2012 7:58 PM
Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)
On 02/03/12 02:56, Nathan Hjelm wrote:
> Found a pretty nasty frag leak (and a minor one) in ob1 (see
> commit below). If this fix addresses some hangs we are seeing on
> infiniband LANL might want a 1.4.6 rolled (or a faster rollout for
What symptoms would an affected job show? Does it fail with an OMPI
error or does it just hang using 0% CPU?
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel_at_[hidden] Phone: +61 (0)3 903 55545