Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Openib with > 32 cores per node
From: Samuel K. Gutierrez (samuel_at_[hidden])
Date: 2011-05-19 11:59:13


On May 19, 2011, at 9:37 AM, Robert Horton wrote:

> On Thu, 2011-05-19 at 08:27 -0600, Samuel K. Gutierrez wrote:
>> Hi,
>> Try the following QP parameters that only use shared receive queues.
>> -mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32
> Thanks for that. If I run the job over 2 x 48 cores it now works and the
> performance seems reasonable (I need to do some more tuning) but when I
> go up to 4 x 48 cores I'm getting the same problem:
> [compute-1-7.local][[14383,1],86][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one] error creating qp errno says Cannot allocate memory
> [compute-1-7.local:18106] *** An error occurred in MPI_Isend
> [compute-1-7.local:18106] *** on communicator MPI_COMM_WORLD
> [compute-1-7.local:18106] *** MPI_ERR_OTHER: known error not in list
> [compute-1-7.local:18106] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> Any thoughts?

How much memory does each node have? Does this happen at startup?

Try adding:

-mca btl_openib_cpc_include rdmacm

I'm not sure if your version of OFED supports this feature, but using XRC may help. I **think** other tweaks are needed to get this going, but I'm not familiar with the details.
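
For reference, here is a rough Python sketch of how the two settings suggested in this thread could be combined on one mpirun command line, along with a small parser for the receive_queues string. The helper names, the ./my_app program, and the process count of 192 (4 x 48) are placeholders for illustration only. The parser assumes the first two numbers of each queue entry are the buffer size in bytes and the number of buffers, which is my understanding of the format; please check the documentation for your Open MPI version before relying on it.

```python
def parse_receive_queues(spec):
    """Split a btl_openib_receive_queues value into one dict per queue.

    Assumption: entries are separated by ':' and each entry looks like
    TYPE,buffer_size,num_buffers[,further tuning fields...].
    """
    queues = []
    for entry in spec.split(":"):
        fields = entry.split(",")
        numbers = [int(f) for f in fields[1:]]
        queues.append({
            "type": fields[0],            # e.g. "S" for a shared receive queue
            "buffer_bytes": numbers[0],   # assumed: size of each receive buffer
            "num_buffers": numbers[1],    # assumed: number of buffers posted
            "extra": numbers[2:],         # remaining fields, left uninterpreted
        })
    return queues


def build_mpirun_command(num_procs, receive_queues, cpc, program):
    """Assemble an mpirun argument list carrying both MCA parameters."""
    return [
        "mpirun", "-np", str(num_procs),
        "-mca", "btl_openib_receive_queues", receive_queues,
        "-mca", "btl_openib_cpc_include", cpc,
        program,
    ]


if __name__ == "__main__":
    spec = "S,12288,128,64,32:S,65536,128,64,32"
    for q in parse_receive_queues(spec):
        print(q)
    # ./my_app and 192 processes are placeholders, not values from the job above.
    print(" ".join(build_mpirun_command(192, spec, "rdmacm", "./my_app")))
```

Building the command as a list rather than a single string keeps the queue specification intact as one argument if it is later handed to something like subprocess.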

Hope that helps,

Samuel K. Gutierrez
Los Alamos National Laboratory

> Thanks,
> Rob
> --
> Robert Horton
> System Administrator (Research Support) - School of Mathematical Sciences
> Queen Mary, University of London
> r.horton_at_[hidden] - +44 (0) 20 7882 7345