Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Infinipath context limit
From: Christian Bell (christian.bell_at_[hidden])
Date: 2008-02-06 14:34:00


Hi Daniel --

  PSM should determine your node setup and enable shared contexts
  accordingly, but it looks like something isn't working right. You
  can apply the patch I've attached to this e-mail and things should
  work again.
  
  However, it would be useful to identify what's going wrong. Can
  you compile a hello world program and run it with the machinefile
  you're trying to use? Then send me the output from:

  mpirun -machinefile .... env PSM_TRACEMASK=0x101 ./hello_world

  Your failure mode only makes sense to me if the 8-core node is
  somehow being detected as a 4-core node. The trace output should
  tell us whether that is the case.
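
  As a rough cross-check that does not depend on PSM, the arithmetic
  from the InfiniPath User Guide excerpt quoted below can be sketched
  as follows. This is only an illustration: the function names are
  made up for this example, the limits (4 contexts, up to 4 node
  programs per context) are taken from the quoted guide text, and the
  script does not reproduce how PSM itself detects the node layout.

```python
import math
import os

# Taken from the quoted InfiniPath User Guide text: up to 4 node
# programs from the same MPI job can share one context.
MAX_PROGRAMS_PER_CONTEXT = 4


def max_node_programs(contexts, per_context=MAX_PROGRAMS_PER_CONTEXT):
    """Upper bound on node programs per node for a given context count."""
    return contexts * per_context


def sharing_factor_needed(processes, contexts):
    """Smallest number of programs per context that fits all processes."""
    return math.ceil(processes / contexts)


def fits(processes, contexts, sharing_enabled):
    """True if the processes fit, with or without context sharing."""
    per_context = MAX_PROGRAMS_PER_CONTEXT if sharing_enabled else 1
    return processes <= max_node_programs(contexts, per_context)


if __name__ == "__main__":
    # Core count as seen by the OS, not necessarily what PSM detects.
    print("cores reported by the OS:", os.cpu_count())
    # Hypothetical scenario matching the report: 8 processes, 4 contexts.
    print("fits without sharing:", fits(8, 4, sharing_enabled=False))
    print("fits with sharing:", fits(8, 4, sharing_enabled=True))
    print("programs per context needed:", sharing_factor_needed(8, 4))
```

  With 4 contexts and no sharing, only 4 processes fit, which matches
  the reported failure beyond 4 processes; with sharing enabled, 8
  processes would need 2 programs per context.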

  cheers,

    . . christian
  

On Wed, 06 Feb 2008, Daniël Mantione wrote:

> Hello,
>
> I am trying to use OpenMPI on a cluster with InfiniPath and 8-core nodes.
> I get these errors when using more than 4 processes per node:
>
> node017.13311ipath_userinit: assign_port command failed: Device or
> resource busy
> [node017:13311] Open MPI failed to open a PSM endpoint: No free InfiniPath
> contexts available on /dev/ipath
> [node017:13311] Error in psm_ep_open (error No free ports could be
> obtained)
> node017.13315ipath_userinit: assign_port command failed: Device or
> resource busy
> [node017:13315] Open MPI failed to open a PSM endpoint: No free InfiniPath
> contexts available on /dev/ipath
> [node017:13315] Error in psm_ep_open (error No free ports could be
> obtained)
> node017.13314ipath_userinit: assign_port command failed: Device or
> resource busy
> node017.13313ipath_userinit: assign_port command failed: Device or
> resource busy
> [node017:13313] Open MPI failed to open a PSM endpoint: No free InfiniPath
> contexts available on /dev/ipath
> [node017:13313] Error in psm_ep_open (error No free ports could be
> obtained)
> [node017:13314] Open MPI failed to open a PSM endpoint: No free InfiniPath
> contexts available on /dev/ipath
> [node017:13314] Error in psm_ep_open (error No free ports could be
> obtained)
>
> The InfiniPath User Guide says this:
>
> "Context Sharing Enabled: The MPI library provides PSM the local process layout
> so that InfiniPath contexts available on each node can be shared if necessary; for
> example, when running more node programs than contexts. By default, the
> QLE7140 and QHT7140 have a maximum of four and eight sharable InfiniPath
> contexts, respectively. Up to 4 node programs (from the same MPI job) can share
> an InfiniPath context, for a total of 16 node programs per node for each QLE7140
> and 32 node programs per node for each QHT7140.
> The error message when this limit is exceeded is:
>
> No free InfiniPath contexts available on /dev/ipath
> "
>
> It looks like OpenMPI is running into the context limit, apparently 4
> in this case. Can I use the context sharing mentioned above with OpenMPI?
>
> Best regards,
>
> Daniël Mantione

-- 
christian.bell_at_[hidden]
(QLogic Host Solutions Group, formerly Pathscale)