Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] perhaps an openmpi bug, how best to identify?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-07-14 15:21:07


On Jul 12, 2010, at 11:14 AM, Olivier Marsden wrote:

> Hi again,
> after testing as suggested, it is indeed a massive slowdown rather than
> a full-blown machine hang.

Ok.

> Would the next test be to run with debug flags for openmpi ?

You might want to run with

   mpirun --mca mpi_yield_when_idle 1 ...

This will tell the OMPI processing core to call sched_yield() when it's polling for progress (rather than spinning hard, polling for new messages, etc.).
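
To make the difference concrete, here is a minimal Python sketch of the two polling styles. It is only an illustration, not Open MPI's actual progress loop: poll_until, ready, and yield_when_idle are made-up names, and os.sched_yield() is only available on Unix-like systems (the sketch falls back to time.sleep(0) elsewhere).

```python
import os
import time


def poll_until(ready, yield_when_idle, timeout_s=1.0):
    """Poll ready() until it returns True or timeout_s elapses.

    Hypothetical helper for illustration only; returns (done, polls),
    where polls is the number of times ready() was called.
    """
    # Use sched_yield() where the platform provides it (Unix); otherwise
    # fall back to a zero-length sleep, which also gives up the CPU.
    give_up_cpu = getattr(os, "sched_yield", lambda: time.sleep(0))

    deadline = time.monotonic() + timeout_s
    polls = 0
    while time.monotonic() < deadline:
        polls += 1
        if ready():
            return True, polls
        if yield_when_idle:
            # Let another runnable process have the core before polling again.
            give_up_cpu()
        # Without the yield, the loop spins hard and keeps the core busy.
    return False, polls
```

With yield_when_idle set to False, each waiting process keeps its core fully busy until its condition is met; with it set to True, a waiting process hands the core to whatever else is runnable, which matters most when there are more processes than cores.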

You also mentioned that you're running 7 MPI processes. How many processors does your workstation have? If you have fewer than 7, that could explain what you're seeing: if all the MPI processes are aggressively polling for progress, they can bring the machine to a crawl.

That being said, Open MPI *should* auto-detect that it is oversubscribing the machine (i.e., that it's running more processes than available processors) and automatically set mpi_yield_when_idle to 1 by itself. Perhaps the auto-detection is broken somehow...?
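
If you want to check for oversubscription yourself rather than rely on the auto-detection, something along these lines would do it. This is a rough sketch under assumptions, not how Open MPI detects it internally: is_oversubscribed, mpirun_command, and ./my_app are hypothetical names, and os.cpu_count() reports logical CPUs, which may differ from what Open MPI counts as available processors.

```python
import os


def is_oversubscribed(nprocs, ncpus=None):
    """Return True if nprocs exceeds the number of CPUs.

    Hypothetical helper; ncpus defaults to the logical CPU count
    reported by the OS (an assumption, see the note above).
    """
    if ncpus is None:
        ncpus = os.cpu_count() or 1
    return nprocs > ncpus


def mpirun_command(nprocs, program, ncpus=None):
    """Build an mpirun argument list, adding the yield setting if needed."""
    cmd = ["mpirun", "-np", str(nprocs)]
    if is_oversubscribed(nprocs, ncpus):
        # Same MCA parameter as in the command line suggested above.
        cmd += ["--mca", "mpi_yield_when_idle", "1"]
    cmd.append(program)
    return cmd


if __name__ == "__main__":
    # ./my_app is a placeholder for your own MPI executable.
    print(" ".join(mpirun_command(7, "./my_app")))
```

On a machine with fewer than 7 CPUs this prints the mpirun line with the yield setting added; otherwise it prints the plain command.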

-- 
Jeff Squyres
jsquyres_at_[hidden]