Open MPI is clever and by default uses multiple IB adapters, if available.
Open MPI is lazy and establishes connections only when needed.
Both are good.
We have somewhat special nodes: up to 16 sockets, 128 cores, 4 boards, and 4 IB cards each.
The crucial thing is that, starting with v1.6.1, the latency of the very first
PingPong sample between two nodes is really high - some 100x - 200x the usual
latency. You cannot see this with the usual latency benchmarks(*) because
they tend to omit the first samples as a "warm-up phase", but we use a
self-written parallel test which clearly shows this (and which left me musing for some days).
If multirail is disabled (-mca btl_openib_max_btls 1), or if v1.5.3 is used, or
if the MPI processes are preconnected
(http://www.open-mpi.org/faq/?category=running#mpi-preconnect), there are no
such huge latency outliers on the first sample.
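For reference, here is roughly how we invoke the two workarounds on the command
line. The btl_openib_max_btls flag is the one quoted above; the preconnect
parameter name is taken from the linked FAQ, and ./pingpong stands in for our
test program - adjust to your setup:

```shell
# Workaround 1: limit the openib BTL to a single IB adapter (no multirail):
mpirun --mca btl_openib_max_btls 1 -np 2 -H node1,node2 ./pingpong

# Workaround 2: establish all connections eagerly during MPI_Init,
# so the first measured sample does not pay the connection-setup cost
# (parameter name per the Open MPI preconnect FAQ):
mpirun --mca mpi_preconnect_mpi 1 -np 2 -H node1,node2 ./pingpong
```

With either of these, the first-sample outlier disappears for us.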
Well, we know about warm-up and lazy connections.
But 200x?!
Any comments on whether this is expected?
(*) E.g. HPCC explicitly says in http://icl.cs.utk.edu/hpcc/faq/index.html#132:
> Additional startup latencies are masked out by starting the measurement after
> one non-measured ping-pong.
P.S. Sorry for cross-posting to both the Users and Developers lists, but my last
questions to the Users list have received no reply so far, so I am trying to broadcast...
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915