Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] BLCR + Qlogic infiniband
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2012-11-30 11:41:14


The openib BTL and BLCR support in Open MPI were working about a year ago
(when I last checked). The psm BTL is not supported at the moment though.

From the error, I suspect that we are not fully closing the openib BTL
driver before the checkpoint, so on restart it looks for a resource that
is no longer present. I created a ticket for us to investigate further,
if you want to follow it:
  https://svn.open-mpi.org/trac/ompi/ticket/3417
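
For anyone reading the quoted restart log, here is a minimal sketch of how
its error codes can be decoded. It assumes the ffffffffffffffea value on the
do_mmap line is a 64-bit two's-complement return code, which would agree with
the -22 reported by thaw_threads. The helper name decode_kernel_return is made
up for illustration and is not part of BLCR or Open MPI.

```python
import errno
import os


def decode_kernel_return(hex_value: str, bits: int = 64) -> int:
    """Interpret a hex string as a signed two's-complement integer.

    Hypothetical helper for illustration only.
    """
    raw = int(hex_value, 16)
    # Values with the top bit set represent negative numbers.
    if raw >= 1 << (bits - 1):
        raw -= 1 << bits
    return raw


# Value copied from the do_mmap line of the quoted restart log.
ret = decode_kernel_return("ffffffffffffffea")

# Kernel-style return codes are negated errno values.
name = errno.errorcode[-ret]

print(ret, name, os.strerror(-ret))
```

Under that assumption the value decodes to -22, which is EINVAL. That is
consistent with the final "Restart failed: Invalid argument" line, so the
failure appears to come from the attempt to re-map /dev/ipath at restart.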

Unfortunately, I do not know who is currently supporting that code path (I
might pick it back up at some point, but cannot promise anything in the
near future). But I will keep an eye on the ticket and see what I can do.
If it is what I think it is, then it should not take too much work to get
it working again.

-- Josh

On Wed, Nov 28, 2012 at 5:14 AM, William Hay <w.hay_at_[hidden]> wrote:

> I'm trying to build openmpi with support for BLCR plus qlogic infiniband
> (plus grid engine). Everything seems to compile OK and checkpoints are
> taken but whenever I try to restore a checkpoint I get the following error:
> - do_mmap(<file>, 00002aaab18c7000, 0000000000001000, ...) failed:
> ffffffffffffffea
> - mmap failed: /dev/ipath
> - thaw_threads returned error, aborting. -22
> - thaw_threads returned error, aborting. -22
> Restart failed: Invalid argument
>
> This occurs whether I specify psm or openib as the btl.
>
> This looks like the sort of thing I would expect to be handled by the blcr
> supporting code in openmpi. So I guess I have a couple of questions.
> 1) Are Infiniband and BLCR support in openmpi compatible?
> 2) Are there any special tricks necessary to get them working together?

-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey