Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpirun only works when -np <4 (Gus Correa)
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-12-10 17:20:12


On Dec 10, 2009, at 5:01 PM, Gus Correa wrote:

> > Just a quick interjection: I also have a dual-quad Nehalem system, HT
> > on, 24 GB RAM, hand-compiled 1.3.4 with options: --enable-mpi-threads
> > --enable-mpi-f77=no --with-openib=no
> >
> > With v1.3.4 I see roughly the same behavior: hello and ring work, but
> > connectivity fails randomly with np >= 8. Turning on -v increased the
> > success rate, but it still hangs. np = 16 fails more often, and which
> > pair of processes hangs while communicating varies from run to run.
> >
> > However, it seems to be related to the shared memory layer problem.
> > Running with -mca btl ^sm works consistently through np = 128.

Note, too, that --enable-mpi-threads "works", but I would not call it production-quality yet. IBM is looking into thread-safety issues to harden this code. If the same hangs can be observed without --enable-mpi-threads, that would be a good data point.
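
One way to collect that data point is to repeat the connectivity test a number of times, with and without the shared-memory BTL, and count how often it hangs. The Python sketch below is only an illustration, not an Open MPI tool. It assumes that mpirun is on the PATH and that the connectivity example has been built as ./connectivity_c. The function names, the 60-second timeout, and the trial count are all hypothetical choices. Running it once against an install configured with --enable-mpi-threads and once against an install configured without it would give the two sets of numbers to compare.

```python
import subprocess


def build_cmd(np, exclude_sm, program="./connectivity_c"):
    """Return the mpirun argument list for one trial.

    The program path is an assumption; point it at your own test binary.
    """
    cmd = ["mpirun", "-np", str(np)]
    if exclude_sm:
        # Same workaround as in the quoted report: exclude the sm BTL.
        cmd += ["-mca", "btl", "^sm"]
    cmd.append(program)
    return cmd


def run_trial(cmd, timeout_s=60):
    """Run one trial and classify it as 'ok', 'fail', or 'hang'.

    A run that exceeds timeout_s (an arbitrary limit) is counted as a hang.
    subprocess.run kills mpirun on timeout; stray ranks may need manual cleanup.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return "ok" if result.returncode == 0 else "fail"


def tally(np_values, trials=10, runner=run_trial):
    """Count outcomes per (np, exclude_sm) combination."""
    counts = {}
    for np in np_values:
        for exclude_sm in (False, True):
            cmd = build_cmd(np, exclude_sm)
            outcomes = [runner(cmd) for _ in range(trials)]
            counts[(np, exclude_sm)] = {
                kind: outcomes.count(kind) for kind in ("ok", "fail", "hang")
            }
    return counts


if __name__ == "__main__":
    for (np, exclude_sm), c in sorted(tally([4, 8, 16]).items()):
        btl = "btl ^sm" if exclude_sm else "default btl"
        print(f"np={np:3d} {btl:12s} ok={c['ok']} fail={c['fail']} hang={c['hang']}")
```

If the hang counts with the default BTL selection drop to zero on the build without --enable-mpi-threads, that points at the thread-support code; if they do not, the shared-memory layer itself is the more likely suspect.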

-- 
Jeff Squyres
jsquyres_at_[hidden]