Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] spin-wait backoff
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-09-03 08:05:45


On Sep 3, 2010, at 12:16 AM, Ralph Castain wrote:

> Backing off the polling rate requires more application-specific logic like that offered below, so it is a little difficult for us to implement at the MPI library level. Not saying we eventually won't - just not sure anyone quite knows how to do so in a generalized form.

FWIW, we've *talked* about this kind of stuff among the developers -- it's at least somewhat similar to the "backoff to blocking communications instead of polling communications" issues. That work in particular has been discussed for a long time but never implemented.
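
To make the "application-specific logic" concrete, here is a minimal sketch in C of polling with backoff at the application level. It is not Open MPI code and not something the library does today. The names `poll_fn`, `poll_with_backoff`, `next_delay_ns`, and `countdown_done` are hypothetical, and the spin count and delay bounds are arbitrary assumptions. In a real MPI program the callback would presumably wrap a call such as `MPI_Test` on an outstanding request.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns true once the awaited event is done.
 * Assumption: in an MPI code this would wrap MPI_Test on a request. */
typedef bool (*poll_fn)(void *ctx);

/* Double the sleep interval, capped at max_ns. */
static long next_delay_ns(long delay_ns, long max_ns)
{
    if (delay_ns >= max_ns / 2)
        return max_ns;
    return delay_ns * 2;
}

/* Poll done(ctx) until it returns true.
 * The first spin_polls attempts are back-to-back (pure spinning); after
 * that, each failed poll is followed by a sleep that grows from min_ns
 * up to max_ns. Returns the total number of polls performed. */
static unsigned long poll_with_backoff(poll_fn done, void *ctx,
                                       unsigned long spin_polls,
                                       long min_ns, long max_ns)
{
    unsigned long polls = 0;
    long delay_ns = min_ns;

    for (;;) {
        polls++;
        if (done(ctx))
            return polls;
        if (polls <= spin_polls)
            continue;               /* still in the spin phase */

        struct timespec ts;
        ts.tv_sec  = delay_ns / 1000000000L;
        ts.tv_nsec = delay_ns % 1000000000L;
        nanosleep(&ts, NULL);       /* yield the CPU before polling again */
        delay_ns = next_delay_ns(delay_ns, max_ns);
    }
}

/* Stand-in event for demonstration only: ctx points to an int counter,
 * and the event "completes" when the counter reaches zero. */
static bool countdown_done(void *ctx)
{
    int *remaining = ctx;
    return --(*remaining) <= 0;
}
```

With `int n = 5;`, a call such as `poll_with_backoff(countdown_done, &n, 2, 1000, 4000)` spins for the first two polls and then sleeps between later ones. The trade-off is the usual one: longer sleeps free up the CPU but add latency once the message finally arrives, which is why suitable values depend on the application.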

Are your jobs hanging because of a deadlock (i.e., an application error) or because of an infrastructure error? If it is a deadlock, there are some PMPI-based tools that might be able to help.

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/