Open MPI User's Mailing List Archives

From: Michael Kluskens (mklus_at_[hidden])
Date: 2006-10-10 13:55:42


On Oct 6, 2006, at 12:04 AM, Jeff Squyres wrote:

> On 10/5/06 2:42 PM, "Michael Kluskens" <mklus_at_[hidden]> wrote:
>
>> System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5,
>> Intel ifort 9.0.32 all tests with 4 processors (comments below)
>
> Good. Can you expand on what you mean by "slowed down"?

There is a bad interaction between the BLACS tester and Open MPI 1.1.2rc3 (less so
with Open MPI 1.3a1r12069).

The last thing the BLACS tester does is:

The final auxiliary test is for BLACS_ABORT.
Immediately after this message, all processes should be killed.
If processes survive the call, your BLACS_ABORT is incorrect.
{0,2}, pnum=2, Contxt=0, killed other procs, exiting with error #-1.

forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)

This test leaves "orted" running on the second node only, using
99% of the CPU. In contrast, with Open MPI 1.3a1r12069 orted is left
running on both nodes but using no CPU time -- that may be perfectly
normal after BLACS_ABORT.

Trying to run either the C or Fortran BLACS tester again after that
first run causes the tester to slow down and sometimes hang.
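For reference, a quick way to check for (and clean up) leftover orted
daemons before rerunning the tester -- a generic sketch, not part of the
BLACS suite, to be run on each node:

```shell
# Hedged sketch: list any leftover orted daemons (Open MPI's runtime
# daemon) with their CPU usage; an orted pinned near 99% CPU matches
# the symptom described above.
ps -eo pid,pcpu,comm | awk '$3 == "orted" { print "orted pid=" $1 " cpu=" $2 "%" }'

# To clean them up before the next run (on each node):
# pkill orted
```

Repeating this on every node before the next tester run avoids the
slowdown caused by the stray daemon.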

The final message with OpenMPI 1.3a1r12069 is:

The final auxiliary test is for BLACS_ABORT.
Immediately after this message, all processes should be killed.
If processes survive the call, your BLACS_ABORT is incorrect.

Michael