Open MPI version: 1.7.2, on an InfiniBand (IB) system.
Test: everybody sends to everybody (Irecv, Isend, Wait), 1024 processes in total.
"[warn] opal_libevent2019_event_base_loop: reentrant invocation. Only one event_base_loop can run on each event_base at once."
The problem doesn't show up with 512 ranks, only with 1024 ranks.
My guess is that somewhere in the openib BTL we still have a
blocking free-list allocation that causes a recursive call to progress.
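
To illustrate the suspected mechanism, here is a toy sketch in plain C. It is not Open MPI or libevent code: every name in it (toy_progress, toy_event_loop, toy_free_list_wait, toy_run) is hypothetical, and it only models the guess above, namely that a callback running inside the event loop blocks on an empty free list and spins on progress, which re-enters the same event loop.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model only. All names are hypothetical stand-ins, not Open MPI APIs. */
static int loop_running;       /* nonzero while the toy event loop is active */
static int reentrant_warnings; /* number of rejected reentrant invocations   */
static int free_items;         /* items currently on the toy free list       */
static int pending_returns;    /* items that progress will hand back later   */

static void toy_progress(void);

/* Blocking allocation: spin on progress until an item becomes available. */
static void toy_free_list_wait(void)
{
    while (free_items == 0) {
        toy_progress();
    }
    free_items--;
    assert(free_items >= 0);
}

/* Event loop with a reentrancy guard, similar in spirit to the warning. */
static void toy_event_loop(void)
{
    if (loop_running) {
        reentrant_warnings++;
        fprintf(stderr, "[toy warn] toy_event_loop: reentrant invocation\n");
        return; /* refuse to run a second loop on the same base */
    }
    loop_running = 1;
    toy_free_list_wait(); /* a callback that needs a free-list item */
    loop_running = 0;
}

/* Progress: run the event loop, then let another component return an item. */
static void toy_progress(void)
{
    toy_event_loop();
    if (pending_returns > 0) {
        pending_returns--;
        free_items++;
    }
}

/* Run one progress call; return how many reentrant warnings it triggered. */
int toy_run(int initial_free_items)
{
    loop_running = 0;
    reentrant_warnings = 0;
    free_items = initial_free_items;
    pending_returns = 1;
    toy_progress();
    return reentrant_warnings;
}
```

In this model, toy_run(1) finds an item on the free list and never re-enters the loop, while toy_run(0) exhausts the list and triggers one warning. If the guess is right, that would be consistent with the warning appearing only at the larger rank count, where more outstanding requests could drain the free list.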
Pavel (Pasha) Shamis
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory