I grabbed the new OMPI 1.6.1 and ran my test that caused a hang with 1.6.0 when registered memory was low. From reading the release notes, rather than a hang I would expect:
* lower performance/fall back to send/receive.
* a notice that registered memory could not be allocated.
In my case I still get a hang. Is this expected? This is running with the default registered memory limits, and I do appreciate the message telling me that only 4GB of my 48GB of memory can be registered. We will also be fixing our load to raise this value, which should make this issue moot.
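For reference, here is a rough sketch of where a 4GB limit can come from. It assumes Mellanox mlx4 hardware, where the registerable memory is bounded by the `log_num_mtt` and `log_mtts_per_seg` module parameters of `mlx4_core` together with the page size. The hardware type and the parameter values below are assumptions for illustration, not values read from these nodes.

```python
# Sketch: upper bound on registerable memory for the mlx4_core driver,
# using max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * page_size.
# The parameter values used below are illustrative assumptions.

GIB = 2 ** 30


def max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size=4096):
    """Return the registered memory limit in bytes for the given settings."""
    return (2 ** log_num_mtt) * (2 ** log_mtts_per_seg) * page_size


if __name__ == "__main__":
    # One combination that yields a 4 GiB limit with 4 KiB pages.
    print(max_registerable_bytes(20, 0) / GIB)
    # One combination that would cover a 48 GiB node with room to spare.
    print(max_registerable_bytes(24, 1) / GIB)
```

Under these assumptions, raising either parameter by one doubles the limit, so a node with 48GB of RAM needs the product to reach well past the 4GB that a small setting gives.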
Honestly I think what I would want is for MPI to blow up saying "can't allocate registered memory, fatal, contact your admin", rather than fall back to send/receive and just be slower.
Am I reading the release notes correctly? Is there a tunable setting to make it blow up rather than fall back?
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
brockp_at_[hidden]
(734)936-1985