Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] MTT parameters for really big nodes?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-11-04 06:11:04


Yevgeny -

Could Mellanox update the FAQ item about this?

Large-memory nodes are becoming more common.

On Nov 3, 2012, at 6:33 PM, Yevgeny Kliteynik wrote:

> Hi Paul,
>
> On 10/31/2012 10:22 PM, Paul Kapinos wrote:
>> Hello Yevgeny, hello all,
>>
>> Yevgeny, first of all thanks for explaining what the MTT parameters do and why there are two of them! I mean this post:
>> http://www.open-mpi.org/community/lists/devel/2012/08/11417.php
>>
>> Well, the official recommendation is "twice the RAM amount".
>>
>> And here we are: we have 2 nodes with 2 TB (that's with a 'tera') of RAM and a couple of nodes with 1 TB, each with 4x Mellanox IB adapters. Thus we should have raised the MTT parameters to make up to 4 TB of memory registrable.
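
The sizing implied above can be sketched as follows. This is a rough illustration, not the driver's own logic: it assumes registrable memory equals 2^log_num_mtt * 2^log_mtts_per_seg * page size, that the second value in pairs such as 23/5 is the mlx4_core parameter log_mtts_per_seg, and that pages are 4 KiB. The function name `required_log_sum` is hypothetical.

```python
TIB = 1 << 40  # one tebibyte in bytes


def required_log_sum(ram_bytes, factor=2, page_size=4096):
    """Return the smallest value of log_num_mtt + log_mtts_per_seg that
    makes at least factor * ram_bytes registrable.

    Assumptions (not from the driver source): 4 KiB pages and
    registrable = 2**log_num_mtt * 2**log_mtts_per_seg * page_size.
    """
    target = ram_bytes * factor
    pages = -(-target // page_size)      # ceiling division
    return (pages - 1).bit_length()      # ceil(log2(pages)) for pages >= 1


if __name__ == "__main__":
    for ram_tib in (1, 2):
        total = required_log_sum(ram_tib * TIB)
        print(f"{ram_tib} TiB RAM, factor 2: "
              f"log_num_mtt + log_mtts_per_seg >= {total}")
```

Under these assumptions a 2 TB node needs the two exponents to sum to 30 to reach the "twice the RAM" mark, while 23/5 sums to 28.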
>
> You don't really *have* to be able to register twice the available RAM.
> It's just a heuristic. It depends on the application that you're running
> and fragmentation that it creates in the MTT.
>
> However:
>
>> I've tried raising the MTT parameters in multiple combinations, but the maximum amount of registrable memory I was able to get was one TB (23 / 5). All attempts to get more (24/5, 23/6 for 2 TB) led to unresponsive InfiniBand HCAs.
>>
>> Are there any other limits in the kernel that have to be adjusted in order to register that much memory?
>
> Unfortunately, the current driver has a limitation in this area, so 1 TB
> (23/5 values) is probably the most the driver can do.
> IIRC, log_num_mtt can reach 26, so perhaps you can try 26/2 (same 1TB),
> and then, if it works, try 26/3 (fingers crossed), which will bring you
> to 2 TB, but I'm not sure it will work.
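
For reference, the value pairs mentioned in this thread can be checked with a short calculation. This is only a sketch under assumptions: registrable memory is taken to be 2^log_num_mtt * 2^log_mtts_per_seg * page size, the second number in each pair is assumed to be log_mtts_per_seg, and the page size is assumed to be 4 KiB. The function name `max_registrable_bytes` is made up for illustration. It says nothing about whether the driver accepts a given pair.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
TIB = 1 << 40     # one tebibyte in bytes


def max_registrable_bytes(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Registrable memory under the assumed formula:
    2**log_num_mtt * 2**log_mtts_per_seg * page_size."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


if __name__ == "__main__":
    # Pairs quoted in the thread, written as log_num_mtt / log_mtts_per_seg.
    for num, per_seg in [(23, 5), (24, 5), (23, 6), (26, 2), (26, 3)]:
        tib = max_registrable_bytes(num, per_seg) / TIB
        print(f"{num}/{per_seg}: {tib:g} TiB")
```

With these assumptions, 23/5 and 26/2 both come out to 1 TiB, and 24/5, 23/6 and 26/3 to 2 TiB, which matches the figures quoted above.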
>
> This limitation has already been fixed, and the fix was accepted into the
> upstream Linux kernel, so it will be included in the next OFED/MLNX_OFED versions.
>
> -- YK
>
>
>> Best,
>>
>> Paul Kapinos
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/