
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Openib with > 32 cores per node
From: Robert Horton (r.horton_at_[hidden])
Date: 2011-05-19 11:37:56


On Thu, 2011-05-19 at 08:27 -0600, Samuel K. Gutierrez wrote:
> Hi,
>
> Try the following QP parameters that only use shared receive queues.
>
> -mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32
>
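
[For reference, a full launch line using that setting might look like the
following. Only the btl_openib_receive_queues value comes from the
suggestion above; the rank count, hostfile, BTL list, and application
binary are placeholders.]

```shell
# Hypothetical launch line -- ./app, hosts, and -np 96 are placeholders.
# The receive_queues value configures two shared receive queues (SRQs)
# instead of per-peer queues, reducing per-connection QP memory.
mpirun -np 96 --hostfile hosts \
    -mca btl openib,self,sm \
    -mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32 \
    ./app
```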

Thanks for that. If I run the job over 2 x 48 cores it now works, and the
performance seems reasonable (I need to do some more tuning), but when I
go up to 4 x 48 cores I'm getting the same problem:

[compute-1-7.local][[14383,1],86][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one] error creating qp errno says Cannot allocate memory
[compute-1-7.local:18106] *** An error occurred in MPI_Isend
[compute-1-7.local:18106] *** on communicator MPI_COMM_WORLD
[compute-1-7.local:18106] *** MPI_ERR_OTHER: known error not in list
[compute-1-7.local:18106] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

Any thoughts?

Thanks,
Rob

-- 
Robert Horton
System Administrator (Research Support) - School of Mathematical Sciences
Queen Mary, University of London
r.horton_at_[hidden]  -  +44 (0) 20 7882 7345