Open MPI User's Mailing List Archives

From: Galen M. Shipman (gshipman_at_[hidden])
Date: 2005-10-25 11:13:13

Hi Troy,

Sorry for the delay. I am now able to reproduce this behavior when I
do not specify HPL_NO_DATATYPE. If I do specify HPL_NO_DATATYPE, the
run completes. We will be looking into this now.



On Oct 21, 2005, at 5:03 PM, Troy Telford wrote:

> I've been trying out the RC4 builds of OpenMPI; I've been using
> Myrinet (gm), Infiniband (mvapi), and TCP.
> When running a benchmark such as IMB (formerly PALLAS, IIRC), or even
> a simple hello world, there are no problems.
> However, when running HPL (and HPCC, which is a superset of HPL), I
> have run into a problem: when running HPL (or when the execution
> reaches the HPL portion of HPCC), the process seems to get wedged...
> I have no problems compiling and building HPL and HPCC for MPICH
> variants (including MVAPICH, MPICH-GM/MX) and LAM; no problems with
> the gcc, Intel, PGI, or Pathscale compilers.
> The HPL.dat (and hpccinf.txt) can be identical across the machines.
> The machines are identically configured (except for the interconnect).
> However, when running the HPL code (on OpenMPI), HPL will peg the
> CPUs and run until I feel like killing it. If the 'N' size is larger
> than a fraction of a percent of free system memory (0.1% of free
> memory; the system has 2 GB/CPU, in my case), HPL and HPCC will not
> finish computing that problem size. (Case in point -- an N size that
> is small enough that it takes 1-2 seconds with MPICH, MPICH-GM,
> MVAPICH, or LAM doesn't complete after several minutes on OpenMPI.)
> I'm therefore somewhat confused; I've seen posts from people who
> claim to have run HPL with OpenMPI. I've had no issues running other
> benchmarks on OpenMPI, but HPL-based code seems to wedge itself... The
> behavior is consistent when I use Myrinet, Infiniband, or Ethernet.
> I am running OpenMPI on Linux (SuSE Enterprise 9, SP2, x86_64).
> Dual-Opteron 248; 2 GB/CPU
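
To give a rough sense of how small the failing problem sizes in the quoted report are, here is a minimal sketch of the arithmetic. It assumes that "0.1% of free memory" refers to a single 2 GB share (taken here as 2 GiB) and that the whole dense N x N double-precision matrix must fit in that fraction; the post does not state the process count or the actual free memory, so the function names and the resulting figure are illustrative only.

```python
import math

BYTES_PER_DOUBLE = 8  # HPL factors a dense N x N matrix of 8-byte doubles


def matrix_bytes(n):
    """Storage needed for an n x n matrix of doubles, in bytes."""
    return BYTES_PER_DOUBLE * n * n


def max_n_for_fraction(mem_bytes, fraction):
    """Largest n whose n x n double matrix fits in fraction * mem_bytes."""
    budget = int(mem_bytes * fraction)  # byte budget, rounded down
    return math.isqrt(budget // BYTES_PER_DOUBLE)


# Assumption: one CPU's 2 GB share, treated as 2 GiB, all of it free.
mem_per_cpu = 2 * 1024**3
threshold_n = max_n_for_fraction(mem_per_cpu, 0.001)
print("N at 0.1% of 2 GiB:", threshold_n)
print("matrix size in bytes:", matrix_bytes(threshold_n))
```

Under these assumptions the threshold works out to an N of a few hundred, which fits the report that the same size finishes in 1-2 seconds under the other MPI implementations.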