
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] MTT parameters for really big nodes?
From: Paul Kapinos (kapinos_at_[hidden])
Date: 2012-11-09 05:28:30


Yevgeny, Jeff,
I've tried 26/2 on a node with 2TB RAM - the IB cards are not reachable with
this setup.

26/3 is not yet tested (it's a bit of work for our admins to 'repair' a node in
case it is no longer reachable over the IB interface).

By now we have a couple of nodes with up to 2TB RAM running with the 23/5 setup;
this seems to be the sound barrier.
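
For reference, the value pairs discussed in this thread follow from a simple relation (a sketch, not from the driver source: it assumes 4 KiB pages and that registrable memory is 2^log_num_mtt * 2^log_mtts_per_seg * page_size, as described in Yevgeny's earlier post linked below):

```python
# Approximate registrable memory for given mlx4 MTT module parameters.
# Assumption: registrable bytes = 2^log_num_mtt * 2^log_mtts_per_seg * page_size

PAGE_SIZE = 4096  # 4 KiB pages assumed

def registrable_bytes(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Upper bound on memory registrable with the given MTT settings."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size

TIB = 1 << 40
for num, seg in [(23, 5), (26, 2), (26, 3), (24, 5)]:
    print(f"{num}/{seg}: {registrable_bytes(num, seg) // TIB} TiB")
# 23/5 and 26/2 both give 1 TiB; 26/3 and 24/5 both give 2 TiB
```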

Best,

Paul

On 11/04/12 13:29, Yevgeny Kliteynik wrote:
> Hi Jeff,
>
> On 11/4/2012 1:11 PM, Jeff Squyres wrote:
>> Yevgeny -
>>
>> Could Mellanox update the FAQ item about this?
>>
>> Large-memory nodes are becoming more common.
>
> Sure. But I'd like to hear Paul's input on this first.
> Did it work with log_num_mtt=26?
> I don't have machines like that to test this on.
>
> -- YK
>
>
>>
>> On Nov 3, 2012, at 6:33 PM, Yevgeny Kliteynik wrote:
>>
>>> Hi Paul,
>>>
>>> On 10/31/2012 10:22 PM, Paul Kapinos wrote:
>>>> Hello Yevgeny, hello all,
>>>>
>>>> Yevgeny, first of all thanks for explaining what the MTT parameters do and why there are two of them! I mean this post:
>>>> http://www.open-mpi.org/community/lists/devel/2012/08/11417.php
>>>>
>>>> Well, the official recommendation is "twice the RAM amount".
>>>>
>>>> And here we are: we have 2 nodes with 2 TB (that with a 'tera') RAM and a couple of nodes with 1TB, each with 4x Mellanox IB adapters. Thus we would have to raise the MTT parameters in order to make up to 4 TB of memory registrable.
>>>
>>> You don't really *have* to be able to register twice the available RAM.
>>> It's just heuristics. It depends on the application that you're running
>>> and fragmentation that it creates in the MTT.
>>>
>>> However:
>>>
>>>> I've tried to raise the MTT parameters in multiple combinations, but the maximum amount of registrable memory I was able to get was one TB (23 / 5). All attempts to get more (24/5, 23/6 for 2 TB) led to unresponsive InfiniBand HCAs.
>>>>
>>>> Are there any other limits in the kernel that have to be adjusted in order to be able to register that much memory?
>>>
>>> Unfortunately, the current driver has a limitation in this area, so 1TB
>>> (the 23/5 values) is probably the most the driver can do.
>>> IIRC, log_num_mtt can reach 26, so perhaps you can try 26/2 (same 1TB),
>>> and then, if it works, try 26/3 (fingers crossed), which will bring you
>>> to 2 TB, but I'm not sure it will work.
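
[Editor's note: the log_num_mtt / log_mtts_per_seg values above are module parameters of the mlx4_core driver; a minimal sketch of how they are typically set (the file name is an assumption, and a driver reload or reboot is needed for the change to take effect):

```
# /etc/modprobe.d/mlx4_core.conf  (any modprobe.d file works)
options mlx4_core log_num_mtt=26 log_mtts_per_seg=3
```
]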
>>>
>>> This has already been fixed, and the fix was accepted to the upstream
>>> Linux kernel, so it will be included in the next OFED/MLNX_OFED versions.
>>>
>>> -- YK
>>>
>>>
>>>> Best,
>>>>
>>>> Paul Kapinos
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>

-- 
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915