Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ScaLAPACK and OpenMPI > 1.3.1
From: Åke Sandgren (ake.sandgren_at_[hidden])
Date: 2010-01-21 16:23:28


On Thu, 2010-01-21 at 14:48 -0600, Champagne, Nathan J. (JSC-EV)[Jacobs
Technology] wrote:
> We started having a problem with OpenMPI beginning with version 1.3.2
> where the program output can be correct, junk, or NaNs (result is not
> predictable). The output is the solution of a matrix equation solved
> by ScaLAPACK. We are using the Intel Fortran compiler (version 11.1)
> and the GNU compiler (version 4.1.2) on Gentoo Linux. So far, the
> problem manifests itself for a matrix (N X N) with N ~ 10,000 or more
> with a processor count ~ 64 or more. Note that the problem still
> occurs using OpenMPI 1.4.1.
>
> We build the ScaLAPACK and BLACS libraries locally and use the LAPACK
> and BLAS libraries supplied by Intel.
>
> We wrote a test program to demonstrate the problem. The matrix is
> built on each processor (no communication). Then, the matrix is
> factored and solved. The solution vector is collected from the
> processors and printed to a file by the master processor. The program
> and associated OpenMPI information (ompi_info --all) are available at:
>
> http://www.em-stuff.com/files/files.tar.gz
>
> The file "compile" in the "test" directory is used to create the
> executable. Edit it to reflect libraries on your local machine. Data
> created using OpenMPI 1.3.1 and 1.4.1 are in the "output" directory
> for reference.

What is a correct result, then?
It's hard to test without knowing one.

How often do you get junk or NaNs compared to correct results?
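
For what it's worth, the flow you describe maps onto a fairly standard
PDGESV skeleton. Below is a minimal sketch of a reproducer along those
lines; it is my reconstruction, not your test program, and the grid
shape, problem size, block size, and matrix entries are all assumptions.
Link it against your locally built ScaLAPACK/BLACS and Intel's
BLAS/LAPACK, e.g. mpif90 sketch.f90 -lscalapack ... (site-specific).

program pdgesv_sketch
   implicit none
   integer, parameter :: n = 2000, nb = 64, nrhs = 1   ! assumed sizes
   integer :: iam, nprocs, ictxt, nprow, npcol, myrow, mycol
   integer :: np, nq, nqrhs, lld, info, i, j, ig, jg
   integer :: desca(9), descb(9)
   double precision, allocatable :: a(:,:), b(:,:)
   integer, allocatable :: ipiv(:)
   integer, external :: numroc, indxl2g

   call blacs_pinfo(iam, nprocs)
   nprow = int(sqrt(dble(nprocs)))           ! square-ish process grid
   npcol = nprocs / nprow
   call blacs_get(-1, 0, ictxt)
   call blacs_gridinit(ictxt, 'Row-major', nprow, npcol)
   call blacs_gridinfo(ictxt, nprow, npcol, myrow, mycol)

   if (myrow >= 0) then                      ! processes inside the grid
      np    = numroc(n, nb, myrow, 0, nprow)     ! local row count
      nq    = numroc(n, nb, mycol, 0, npcol)     ! local column count
      nqrhs = numroc(nrhs, nb, mycol, 0, npcol)
      lld   = max(1, np)
      call descinit(desca, n, n,    nb, nb, 0, 0, ictxt, lld, info)
      call descinit(descb, n, nrhs, nb, nb, 0, 0, ictxt, lld, info)
      allocate(a(lld, max(1, nq)), b(lld, max(1, nqrhs)), ipiv(np + nb))

      ! Each process fills its local entries from global indices, with
      ! no communication, as you describe; the entries here are just a
      ! diagonally dominant stand-in so the solve is well behaved.
      do j = 1, nq
         jg = indxl2g(j, nb, mycol, 0, npcol)
         do i = 1, np
            ig = indxl2g(i, nb, myrow, 0, nprow)
            a(i, j) = 1.0d0 / dble(ig + jg)
            if (ig == jg) a(i, j) = a(i, j) + dble(n)
         end do
      end do
      b = 1.0d0

      ! Factor and solve in one call.
      call pdgesv(n, nrhs, a, 1, 1, desca, ipiv, b, 1, 1, descb, info)
      if (iam == 0) print *, 'pdgesv info =', info
      ! Collecting b on the master for printing could be done with
      ! PDGEMR2D onto a 1x1 grid, as your test presumably does.

      call blacs_gridexit(ictxt)
   end if
   call blacs_exit(0)
end program pdgesv_sketch

Running something like this repeatedly at the sizes where you see
failures (N ~ 10,000, 64+ ranks) and diffing the runs against each
other would at least show whether the nondeterminism reproduces
outside your full setup.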

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: ake_at_[hidden]   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se