Interesting, the log_num_mtt and log_mtts_per_seg params were not set.
Setting them to utilise 2*8G of my RAM resulted in no change to the stalls or run time, i.e. with (19,3), (20,2), (21,1) or (6,16).
In all cases, openib takes twice as long as TCP, except if I push the small message max to 64K and force short messages. Then the openib times are the same as TCP, and no faster.
I'm still at a loss as to why...
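
For reference, a minimal sketch of the arithmetic behind those pairs. It assumes the commonly cited formula max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * page_size and a 4 KiB page size; neither is stated in this thread, and the function name is only illustrative. Because the formula depends only on the sum of the two exponents, the order of the values within each pair does not change the result.

```python
# Sketch only: the formula and the 4 KiB page size are assumptions,
# not values taken from this thread.
PAGE_SIZE = 4096  # bytes, assumed


def max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Return the registerable memory limit in bytes under the assumed formula."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


# The pairs tried above, read as (log_num_mtt, log_mtts_per_seg).
PAIRS = [(19, 3), (20, 2), (21, 1), (6, 16)]

if __name__ == "__main__":
    for log_num_mtt, log_mtts_per_seg in PAIRS:
        gib = max_registerable_bytes(log_num_mtt, log_mtts_per_seg) / 2**30
        print(f"log_num_mtt={log_num_mtt:2d} "
              f"log_mtts_per_seg={log_mtts_per_seg:2d} -> {gib:.0f} GiB")
```

Under those assumptions all four pairs give the same limit, which would be consistent with seeing no difference between them.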
From: Paul Kapinos <kapinos_at_[hidden]>
To: Randolph Pullen <randolph_pullen_at_[hidden]>; Open MPI Users <users_at_[hidden]>
Sent: Tuesday, 28 August 2012 6:13 PM
Subject: Re: [OMPI users] Infiniband performance Problem and stalling
after reading this:
On 08/28/12 04:26, Randolph Pullen wrote:
> - On occasions it seems to stall indefinitely, waiting on a single receive.
... I would make a blind guess: are you aware of the IB card parameters for registered memory?
"Waiting forever" for a single operation is one of the symptoms of the problem, especially in 1.5.3.
P.S. The lower performance with 'big' chunks is a known phenomenon, cf.
(image at the bottom of the page). But a chunk size of 64k is fairly small.
-- Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915