Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] openib max_cqe
From: Brad Benton (bradford.benton_at_[hidden])
Date: 2012-07-09 12:20:01


I am running into similar issues with both Mellanox and IBM HCAs.

On a node installed with RHEL6.2 and MLNX_OFED-1.5.3-3.0.0, there is a
significant hit to locked memory when going with the device's max_cqe.
Here, for comparison's sake, is the memory utilization for a simple MPI
process when using the new cq_size default and when restricting it to 1500:

cq_size = max_cqe:
VmPeak: 348736 kB
VmSize: 348352 kB
VmLck: 292096 kB
VmHWM: 304896 kB
VmRSS: 304896 kB
VmData: 333504 kB

cq_size = 1500:
VmPeak: 86720 kB
VmSize: 86336 kB
VmLck: 30080 kB
VmHWM: 42880 kB
VmRSS: 42880 kB
VmData: 71488 kB
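
To make the comparison concrete, here is a minimal sketch that parses the two listings above and prints the per-field difference. It assumes the numbers are in the `Vm*` format of Linux's /proc/<pid>/status; the helper names `parse_vm_fields` and `vm_delta` are hypothetical and used only for illustration.

```python
def parse_vm_fields(status_text):
    """Return {field: kB} for the Vm* lines of a /proc/<pid>/status-style dump."""
    fields = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        name = name.strip()
        if not sep or not name.startswith("Vm"):
            continue
        parts = rest.split()
        # Expect "<number> kB"; skip anything that does not match.
        if len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            fields[name] = int(parts[0])
    return fields


def vm_delta(before, after):
    """Per-field growth in kB for fields present in both dumps."""
    return {k: after[k] - before[k] for k in before if k in after}


# Values copied from the two runs reported in this message.
MAX_CQE_RUN = """\
VmPeak: 348736 kB
VmSize: 348352 kB
VmLck: 292096 kB
VmHWM: 304896 kB
VmRSS: 304896 kB
VmData: 333504 kB
"""

SMALL_CQ_RUN = """\
VmPeak: 86720 kB
VmSize: 86336 kB
VmLck: 30080 kB
VmHWM: 42880 kB
VmRSS: 42880 kB
VmData: 71488 kB
"""

if __name__ == "__main__":
    big = parse_vm_fields(MAX_CQE_RUN)
    small = parse_vm_fields(SMALL_CQ_RUN)
    for name, extra in vm_delta(small, big).items():
        print(f"{name}: +{extra} kB with cq_size = max_cqe")
```

On these numbers every field grows by the same 262016 kB (about 256 MB), which is consistent with the extra memory being locked memory, since VmLck accounts for the whole increase.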

For our Power systems using the IBM eHCA, the default value exhausts memory
and we can't even run.

--Brad

On Fri, Jul 6, 2012 at 5:21 AM, TERRY DONTJE <terry.dontje_at_[hidden]> wrote:

>
>
> On 7/5/2012 5:47 PM, Shamis, Pavel wrote:
>
> I mentioned on the call that for Mellanox devices (+OFA verbs) this resource is really cheap. Do you run a Mellanox HCA + OFA verbs?
>
> (I'll reply because I know Terry is offline for the rest of the day)
>
> Yes, he does.
>
> I asked because Sun used to have its own verbs driver.
>
> I noticed this on a Solaris machine. I am not sure I have the same setup
> for Linux, but I'll look and see if I can reproduce the same issue on Linux.
>
> --td
>
> The heart of the question: is it incorrect to assume that we'll consume (num CQE * CQE size) registered memory for each QP opened?
>
> QP or CQ? I think you don't want to assume anything there. Verbs (user and kernel) do their own magic there.
> I think Mellanox should address this question.
>
> Regards,
> Pasha
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.dontje_at_[hidden]
>