Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] OFED question
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2011-01-27 18:00:38


Brian,

   The ability to control the number of available QPs varies by
vendor. Unless things have changed in recent years, Mellanox's firmware
tools allow one to modify the limit, but at the inconvenience of
reburning the firmware. I know of no other way, and I know nothing about
other vendors.
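
Before reburning anything, it may be worth a rough check of whether QP
exhaustion is plausible at all. The sketch below is only a back-of-envelope
estimate, not how Open MPI accounts for QPs internally. It assumes that each
process opens one QP per entry in the colon-separated
btl_openib_receive_queues setting for every remote peer it talks to, that
on-node peers do not use the HCA, and that every process eventually connects
to every remote peer (the worst case). The function names, the job sizes,
and the max_qp value are all hypothetical; the real limit is reported as
max_qp by "ibv_devinfo -v".

```python
def qps_per_connection(receive_queues: str) -> int:
    """Count the colon-separated queue specs in a receive_queues string.

    Assumption: each spec corresponds to one QP per connected peer.
    """
    return len([spec for spec in receive_queues.split(":") if spec.strip()])


def estimate_node_qps(total_procs: int, procs_per_node: int, qps_per_peer: int) -> int:
    """Worst-case QPs one node's HCA must hold if every process on the node
    connects to every process on other nodes."""
    remote_peers = total_procs - procs_per_node
    return procs_per_node * remote_peers * qps_per_peer


def fits(total_procs: int, procs_per_node: int, qps_per_peer: int, max_qp: int) -> bool:
    """True if the worst-case estimate stays within the HCA's QP limit."""
    return estimate_node_qps(total_procs, procs_per_node, qps_per_peer) <= max_qp


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    max_qp = 65536          # take the real value from `ibv_devinfo -v`
    total, per_node = 4096, 8
    for n in (4, 2):
        need = estimate_node_qps(total, per_node, n)
        print(f"{n} QPs per peer: need {need}, limit {max_qp}, fits={need <= max_qp}")
```

With these made-up numbers, four QPs per peer overshoots the limit while
two QPs per peer just fits, which is at least consistent with the symptom
Brian describes. Connections are normally set up lazily, so the actual
count depends on the application's communication pattern and can be much
lower than this estimate.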

-Paul

On 1/27/2011 2:56 PM, Barrett, Brian W wrote:
> All -
>
> On one of our clusters, we're seeing the following on one of our applications, I believe using Open MPI 1.4.3:
>
> [xxx:27545] *** An error occurred in MPI_Scatterv
> [xxx:27545] *** on communicator MPI COMMUNICATOR 5 DUP FROM 4
> [xxx:27545] *** MPI_ERR_OTHER: known error not in list
> [xxx:27545] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [xxx][[31806,1],0][connect/btl_openib_connect_oob.c:857:qp_create_one] error creating qp errno says Resource temporarily unavailable
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 27545 on
> node rs1891 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
>
>
> The problem goes away if we modify the eager protocol msg sizes so that there are only two QPs necessary instead of the default 4. Is there a way to bump up the number of QPs that can be created on a node, assuming the issue is just running out of available QPs? If not, any other thoughts on working around the problem?
>
> Thanks,
>
> Brian
>
> --
> Brian W. Barrett
> Dept. 1423: Scalable System Software
> Sandia National Laboratories

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900