Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Interaction between Intel and OpenMPI floating point exceptions
From: Iain Bason (Iain.Bason_at_[hidden])
Date: 2009-04-07 10:58:24


On Apr 6, 2009, at 7:22 PM, Steve Lowder wrote:

> Recently I've been running an MPI code that uses the LAPACK slamch
> routine to determine machine precision parameters. This software is
> compiled using the latest Intel Fortran compiler and setting the
> -fpe0 argument to watch for certain floating point errors. The
> slamch routines crashed and printed an OpenMPI stack trace reporting
> an underflow error; however, the Intel -fpe0 setting doesn't abort on
> underflow. When this software is not compiled and linked with
> OpenMPI, it ignores the underflow and doesn't abort when compiled
> with -fpe0.
>
> When I run the MPI version and set --mca opal_signal 6,7,11 the code
> doesn't abort on underflow. I'd like to know if I'm interpreting
> this behavior correctly: it appears that the MPI and non-MPI cases
> handle underflow differently. I'm assuming OpenMPI has a handler
> that processes the interrupts ahead of the Fortran RTL, stopping
> execution. Otherwise the Fortran RTL handler would just ignore the
> underflow. Do I sort of understand what is going on here? Is there
> another solution short of the --mca opal_signal switch?

Your analysis sounds about right to me. There are Fortran intrinsic
routines that can get those machine precision parameters instead of
slamch. Would it be feasible to modify the code to use them?
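
As a rough sketch of what that could look like (not tested against your
code), the standard numeric inquiry intrinsics return these parameters
directly, with no run-time probing arithmetic of the kind that
presumably triggers the underflow in slamch. The program name and the
variable x are just placeholders, and the slamch arguments noted in the
comments are my understanding of the correspondence, so please check
them against your LAPACK version:

```fortran
program machine_params
  implicit none
  real :: x   ! default real kind, i.e. the precision slamch reports on

  ! These are inquiry functions: they depend only on the kind of x,
  ! never on its value, so x does not need to be assigned.
  print *, 'radix       = ', radix(x)        ! assumed to match slamch('B')
  print *, 'digits      = ', digits(x)       ! assumed to match slamch('N')
  print *, 'epsilon     = ', epsilon(x)      ! assumed to match slamch('P'), eps*base
  print *, 'tiny        = ', tiny(x)         ! assumed to match slamch('U')
  print *, 'huge        = ', huge(x)         ! assumed to match slamch('O')
  print *, 'minexponent = ', minexponent(x)  ! assumed to match slamch('M')
  print *, 'maxexponent = ', maxexponent(x)  ! assumed to match slamch('L')
end program machine_params
```

Note that epsilon(x) is the spacing of model numbers near 1.0, which
with round-to-nearest arithmetic may be twice what slamch('E') returns,
so a factor of one half may be needed there.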

Iain