Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] possible bug in 1.3.2 sm transport
From: Bryan Lally (lally_at_[hidden])
Date: 2009-05-19 17:20:27


Jeff Squyres wrote:
> On May 18, 2009, at 11:49 PM, Bryan Lally wrote:
>
>> Ralph sent me a platform file and a corresponding .conf file. I built
>> ompi from openmpi-1.3.3a1r21223.tar.gz, with these files. I've been
>> running my normal tests and have been unable to hang a job yet. I've
>> run enough that I don't expect to see a problem.
>>
>
>
> That's both good and bad. :-)

Right!

> Can you point out specifically which platform file is being used? If
> that platform file is changing something from "not working" to
> "working", it bears a bit closer examination to ensure that we aren't
> just masking a bug.

Here's what we've found. It wasn't the platform file as such. I've
since built with ./configure and some standard, obvious command line
switches. What's then required is to edit the platform configuration
file, <prefix>/etc/openmpi-mca-params.conf, and add:

        coll_sync_priority = 100
        coll_sync_barrier_before = 1000
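
To try the same two settings without editing the installed file, Open MPI also accepts MCA parameters per run, either as OMPI_MCA_<name> environment variables or through mpirun's --mca option. The following is only a sketch of that alternative, not something taken from the message; the application name ./my_app and the process count are hypothetical placeholders.

```shell
# Per-run equivalent of the two file entries above (a sketch, not from the original message).
# Open MPI reads MCA parameters from environment variables named OMPI_MCA_<parameter>.
export OMPI_MCA_coll_sync_priority=100
export OMPI_MCA_coll_sync_barrier_before=1000

# The same values can be passed on the command line instead; ./my_app and
# the process count are hypothetical placeholders:
#   mpirun --mca coll_sync_priority 100 --mca coll_sync_barrier_before 1000 -np 4 ./my_app
```

Settings made this way apply only to jobs launched from that shell or by that command, which makes it easier to compare runs with and without them.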

-- 
Bryan Lally, lally_at_[hidden]
505.667.9954
CCS-2
Los Alamos National Laboratory
Los Alamos, New Mexico