Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] MTT parameters for really big nodes?
From: Yevgeny Kliteynik (kliteyn_at_[hidden])
Date: 2012-11-04 07:29:22


Hi Jeff,

On 11/4/2012 1:11 PM, Jeff Squyres wrote:
> Yevgeny -
>
> Could Mellanox update the FAQ item about this?
>
> Large-memory nodes are becoming more common.

Sure. But I'd like to hear Paul's input on this first.
Did it work with log_num_mtt=26?
I don't have machines of that kind to test this on.
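
For reference, the value pairs discussed in this thread can be checked with a quick calculation. This is only a minimal sketch: it assumes the relation max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * page_size and a 4 KiB page size, neither of which is stated in this message, and the helper name is made up for illustration.

```python
# Assumption: 4 KiB pages (typical on x86_64); not stated in the thread.
PAGE_SIZE = 4096
TIB = 1 << 40


def max_registrable_bytes(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Hypothetical helper: upper bound on registrable memory in bytes.

    Assumed relation:
        max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * page_size
    """
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


# Value pairs (log_num_mtt / log_mtts_per_seg) mentioned in this thread.
for log_num_mtt, log_mtts_per_seg in [(23, 5), (24, 5), (23, 6), (26, 2), (26, 3)]:
    size = max_registrable_bytes(log_num_mtt, log_mtts_per_seg)
    print(f"{log_num_mtt}/{log_mtts_per_seg}: {size / TIB:g} TiB")
```

Under these assumptions 23/5 and 26/2 both come out at 1 TiB, while 24/5, 23/6 and 26/3 come out at 2 TiB, which matches the figures quoted below.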

-- YK

>
> On Nov 3, 2012, at 6:33 PM, Yevgeny Kliteynik wrote:
>
>> Hi Paul,
>>
>> On 10/31/2012 10:22 PM, Paul Kapinos wrote:
>>> Hello Yevgeny, hello all,
>>>
>>> Yevgeny, first of all thanks for explaining what the MTT parameters do and why there are two of them! I mean this post:
>>> http://www.open-mpi.org/community/lists/devel/2012/08/11417.php
>>>
>>> Well, the official recommendation is "twice the RAM amount".
>>>
>>> And here we are: we have 2 nodes with 2 TB (that's with a 'tera') of RAM and a couple of nodes with 1 TB, each with 4x Mellanox IB adapters. So we would need to raise the MTT parameters to make up to 4 TB of memory registrable.
>>
>> You don't really *have* to be able to register twice the available RAM.
>> It's just a heuristic. It depends on the application that you're running
>> and the fragmentation that it creates in the MTT.
>>
>> However:
>>
>>> I've tried raising the MTT parameters in multiple combinations, but the maximum amount of registrable memory I was able to get was 1 TB (23/5). All attempts to get more (24/5, 23/6 for 2 TB) led to unresponsive InfiniBand HCAs.
>>>
>>> Are there any other limits in the kernel that have to be adjusted in order to register that much memory?
>>
>> Unfortunately, the current driver has a limitation in this area, so 1 TB
>> (23/5 values) is probably the most the driver can do.
>> IIRC, log_num_mtt can reach 26, so perhaps you can try 26/2 (same 1TB),
>> and then, if it works, try 26/3 (fingers crossed), which will bring you
>> to 2 TB, but I'm not sure it will work.
>>
>> This has already been fixed, and the fix was accepted into the upstream
>> Linux kernel, so it will be included in the next OFED/MLNX_OFED versions.
>>
>> -- YK
>>
>>
>>> Best,
>>>
>>> Paul Kapinos
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>