Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] BLCR + Qlogic infiniband
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2012-11-30 11:41:14

The openib BTL and BLCR support in Open MPI were working about a year ago
(when I last checked). The psm BTL is not supported at the moment though.

From the error, I suspect that we are not fully closing the openib BTL
driver before the checkpoint, so when we try to restart, it looks for
a resource that is no longer present. I created a ticket for us to
investigate further if you want to follow it:

Unfortunately, I do not know who is currently supporting that code path (I
might pick it back up at some point, but cannot promise anything in the
near future). But I will keep an eye on the ticket and see what I can do.
If it is what I think it is, then it should not take too much work to get
it working again.

-- Josh

On Wed, Nov 28, 2012 at 5:14 AM, William Hay <w.hay_at_[hidden]> wrote:

> I'm trying to build openmpi with support for BLCR plus qlogic infiniband
> (plus grid engine). Everything seems to compile OK and checkpoints are
> taken but whenever I try to restore a checkpoint I get the following error:
> - do_mmap(<file>, 00002aaab18c7000, 0000000000001000, ...) failed: ffffffffffffffea
> - mmap failed: /dev/ipath
> - thaw_threads returned error, aborting. -22
> - thaw_threads returned error, aborting. -22
> Restart failed: Invalid argument
> This occurs whether I specify psm or openib as the btl.
> This looks like the sort of thing I would expect to be handled by the blcr
> supporting code in openmpi. So I guess I have a couple of questions.
> 1) Are Infiniband and BLCR support in openmpi compatible?
> 2) Are there any special tricks necessary to get them working together?
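
For anyone reading the quoted log: the three failure codes in it (ffffffffffffffea, -22, and "Invalid argument") appear to be the same error seen three ways. A minimal sketch of the decoding, assuming the hex value is a 64-bit two's-complement kernel return of the form -errno; the helper name decode_kernel_return is hypothetical and not part of BLCR or Open MPI:

```python
import errno
import os


def decode_kernel_return(hex_value: str) -> tuple[int, str, str]:
    """Interpret a 64-bit hex return value as a negative errno.

    Assumes the value follows the usual kernel convention of
    returning -errno on failure (hypothetical helper for illustration).
    """
    raw = int(hex_value, 16)
    # Reinterpret the unsigned 64-bit value as signed two's complement.
    signed = raw - (1 << 64) if raw >= (1 << 63) else raw
    if signed >= 0:
        raise ValueError("value is not a negative errno")
    code = -signed
    # Map the numeric errno to its symbolic name and message text.
    return signed, errno.errorcode[code], os.strerror(code)


signed, name, message = decode_kernel_return("ffffffffffffffea")
print(signed, name, message)
```

So the do_mmap failure, the thaw_threads return value, and the final "Restart failed" message all point at EINVAL from the attempt to remap /dev/ipath at restart.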

Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse