Gus Correa <gus_at_[hidden]> writes:
> Or run a serial version on the same set of machines,
> compiled in similar ways (compiler version, opt flags, etc)
> to the parallel versions, and compare results.
> If the results don't differ, then you can start blaming MPI.
That wouldn't show that there's actually any OpenMPI-specific problem,
though -- the parallelism potentially introduces indeterminacy. [I
don't mean to imply Gus thinks otherwise, or that anyone has enough
information to guess what's actually happening.] General discussion of
numerical issues and scientific computing war stories must be way
off-topic here...
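
For what it's worth, here is a minimal sketch of the kind of indeterminacy I mean. It is not MPI code, and the helper names (naive_sum, chunked_sum) are made up for illustration. It only assumes that a parallel reduction combines per-rank partial sums, so the additions happen in a different order than in a serial loop. Since floating-point addition isn't associative, the two orders need not agree, and neither has to match the exact answer.

```python
def naive_sum(values):
    """Left-to-right floating-point sum, as a plain serial loop would do it."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, nchunks):
    """Emulate a reduction: each hypothetical 'rank' sums a contiguous
    chunk, then the partial sums are combined in rank order."""
    n = len(values)
    bounds = [n * i // nchunks for i in range(nchunks + 1)]
    partials = [naive_sum(values[bounds[i]:bounds[i + 1]])
                for i in range(nchunks)]
    return naive_sum(partials)


# Contrived data: the exact sum is 2.0, but 1.0 is below the rounding
# granularity of doubles near 1e16, so where it gets added matters.
data = [1.0e16, 1.0, -1.0e16, 1.0]

serial = naive_sum(data)         # 1.0
two_ranks = chunked_sum(data, 2) # 0.0

print("serial   :", serial)
print("two ranks:", two_ranks)
```

Same data, same arithmetic, two different answers, and no bug in any library. A real code will show much smaller differences than this contrived case, but they can be amplified by whatever comes after the reduction.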