Sorry Brian and Jeff - I sent you chasing after something of a red herring...
After much more testing and banging my head on the desk trying to figure this one out, it turns out '--mca mpi_yield_when_idle 1' on the command line does actually work properly for me... The one or two times I had previously tried using the command line argument, my app (by unfortunate coincidence - it took me a long time to figure this one out) happened to run slowly for completely unrelated reasons.
However, instead of typing the command line argument each time, for the bulk of my testing I was putting 'mpi_yield_when_idle = 1' in /usr/local/etc/openmpi-mca-params.conf on the machine I ran 'mpirun' from. I didn't update that file on each of my worker nodes - only on the node I was running 'mpirun' from. I had assumed that this would have the same effect as typing '--mca mpi_yield_when_idle 1' on the command line - mpirun would read /usr/local/etc/openmpi-mca-params.conf, import all of the parameters, then propagate those parameters to the worker nodes as if they had been typed on the command line. Apparently, in reality, orted reads /usr/local/etc/openmpi-mca-params.conf on the local node where orted is actually running, and entries in the file on the node where 'mpirun' is run are not propagated. Is this a bug or an undocumented feature? ;)
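In case it helps anyone else check which nodes actually see the setting, here is a rough Python sketch of a per-node check. It reports the OMPI_MCA_mpi_yield_when_idle environment variable (which Jeff mentioned) alongside whatever the local copy of /usr/local/etc/openmpi-mca-params.conf says. The function names are made up for illustration, and the parsing is an assumption on my part: plain 'key = value' lines, '#' starting a comment, and a later entry overriding an earlier one. It is not Open MPI's own parser.

```python
import os
import socket

PARAM = "mpi_yield_when_idle"
# Path taken from the message above; adjust for your install prefix.
CONF_PATH = "/usr/local/etc/openmpi-mca-params.conf"


def param_from_conf_text(text, name):
    """Return the value assigned to `name` in 'key = value' text, or None."""
    value = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            value = val.strip()  # assumption: a later entry overrides an earlier one
    return value


def report(conf_path=CONF_PATH, name=PARAM, environ=os.environ):
    """Describe what this node sees: the environment variable and the local file."""
    try:
        with open(conf_path) as fh:
            from_file = param_from_conf_text(fh.read(), name)
    except OSError:
        from_file = None  # no readable params file on this node
    from_env = environ.get("OMPI_MCA_" + name)
    return "%s: env=%s file=%s" % (socket.gethostname(), from_env, from_file)


if __name__ == "__main__":
    print(report())
```

Saved as, say, check_yield.py (a hypothetical name) and launched through mpirun in place of the real application, it prints one line per process. A node that shows file=None and env=None has not been told about the parameter by either route.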
Sorry to have wasted your time chasing the wrong problem...
On Fri, May 26, 2006 at 01:09:22PM -0400, Brian W. Barrett wrote:
> On Fri, 26 May 2006, Brian W. Barrett wrote:
> > On Fri, 26 May 2006, Jeff Squyres (jsquyres) wrote:
> >> You can see this by slightly modifying your test command -- run "env"
> >> instead of "hostname". You'll see that the environment variable
> >> OMPI_MCA_mpi_yield_when_idle is set to the value that you passed in on
> >> the mpirun command line, regardless of a) whether you're oversubscribing
> >> or not, and b) whatever is passed in through the orted.
> > While Jeff is correct that the parameter informing the MPI process that it
> > should idle when it's not busy is correctly set, it turns out that we are
> > ignoring this parameter inside the MPI process. I'm looking into this and
> > hope to have a fix this afternoon.
> Mea culpa. Jeff's right that in a normal application, we are setting up
> to call sched_yield() when idle if the user sets mpi_yield_when_idle to 1,
> regardless of what is in the hostfile. The problem with my test case was
> that for various reasons, my test code was never actually "idling" - there
> were always things moving along, so our progress engine was deciding that
> the process should not be idled.
> Can you share your test code at all? I'm wondering if something similar
> is happening with your code. It doesn't sound like it should be "always
> working", but I'm wondering if you're triggering some corner case we
> haven't thought of.
> Brian Barrett
> Graduate Student, Open Systems Lab, Indiana University